I'm currently developing an application with jGlobus (GridFTP) that transfers "lots of small files" (LOSF). The user selects a source directory on one endpoint and a destination directory on another endpoint. Once the user's access to both directories has been checked, my application lists all files in the source directory and its subdirectories, and populates a String[] with the canonical path of each file, such as:
    String[] sourceFiles = new String[]{
        "/home/felipeleao88/losf/file1.txt",
        "/home/felipeleao88/losf/file2.txt",
        "/home/felipeleao88/losf/file3.txt",
        "/home/felipeleao88/losf/subdir/sub_file1.txt",
        "/home/felipeleao88/losf/subdir/sub_file2.txt",
        "/home/felipeleao88/losf/subdir/sub_file3.txt",
        "/home/felipeleao88/losf/subdir/anotherdir/sub_sub_file1.txt",
        "/home/felipeleao88/losf/subdir/anotherdir/sub_sub_file2.txt"
    };
Another String[] is populated with the directory and name that each file in the sourceFiles array should take at the destination. This gives me the two arrays required by GridFTPClient's extendedMultipleTransfer() method (I'm using the full overload, which also takes offsets and file sizes). Everything works like a charm when I send either a single file or a list of files, but when the list is very big the extendedMultipleTransfer() method behaves strangely.
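A minimal sketch of how such a destination array could be derived from the source array, assuming every source path lies under a single source root. The class and method names (PathMapper, toDestinationPaths) and the destination root /data/incoming are hypothetical and do not come from the application or from jGlobus.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of jGlobus: maps each source path to the
// path it should take under the destination root, preserving subdirectories.
final class PathMapper {
    static String[] toDestinationPaths(String sourceRoot, String destRoot, String[] sourceFiles) {
        Path src = Paths.get(sourceRoot);
        Path dst = Paths.get(destRoot);
        String[] result = new String[sourceFiles.length];
        for (int i = 0; i < sourceFiles.length; i++) {
            // Path relative to the source root, e.g. "subdir/sub_file1.txt"
            Path relative = src.relativize(Paths.get(sourceFiles[i]));
            // Force '/' separators, since the endpoints here run Linux
            result[i] = dst.resolve(relative).toString().replace('\\', '/');
        }
        return result;
    }
}
```

With the source root /home/felipeleao88/losf and the assumed destination root /data/incoming, the entry /home/felipeleao88/losf/subdir/sub_file1.txt maps to /data/incoming/subdir/sub_file1.txt, so both arrays keep the same length and ordering.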
When I send an array describing more than a thousand files, the method simply stops responding after a while, without throwing an exception or finishing with an error. It starts the transfer, transfers around 100 files and hangs. I tried transferring 800 files, 850 files, 950 files and other arbitrary counts, and it works fine: all files are transferred. But when my list has over a thousand files, it hangs.
Usually it transfers around 100 files before hanging, but this behaviour is not stable: I have also seen other counts of transferred files (such as 77, 96, 107, 104) before the method freezes. In my tests I'm using generated bulk files of 2 MB each. I have also checked that both endpoints have free disk space.
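Since lists of up to about 950 entries complete, one way to probe the threshold would be to cut the arrays into smaller batches and issue one call per batch. The sketch below covers only the splitting step and does not touch jGlobus; the names Batches and split are hypothetical, and whether batching actually avoids the hang is untested.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: cuts a path array into consecutive slices of at
// most batchSize entries, preserving the original order.
final class Batches {
    static List<String[]> split(String[] paths, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String[]> batches = new ArrayList<>();
        for (int from = 0; from < paths.length; from += batchSize) {
            // The last slice may be shorter than batchSize
            int to = Math.min(from + batchSize, paths.length);
            batches.add(Arrays.copyOfRange(paths, from, to));
        }
        return batches;
    }
}
```

The source and destination arrays would have to be split with the same batch size, and the long[] offset and size arrays sliced at the same indices (Arrays.copyOfRange also has a long[] overload), so that each batch stays aligned.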
I used netstat to watch the open TCP connections, and the application correctly opens the number of TCP connections I asked it to use (4 connections). They remain active even after the method begins hanging, and eventually I have to force the transfer to end manually from the command line.
I'm running CentOS 6.3 on both endpoints, and the application itself runs on Fedora 20. My application uses the following jGlobus Maven dependencies:
My call to the extendedMultipleTransfer method is shown below. Note: gridFtpClientSrc and gridFtpClientDst are instances of GridFTPClient.
    // Setting passive and active FTP modes
    gridFtpClientSrc.setPassive();
    HostPort hp = gridFtpClientDst.setPassive();
    gridFtpClientSrc.setActive(hp);

    // Call to initialize the transfer
    gridFtpClientSrc.extendedMultipleTransfer(
        resumeOffsets,     // long[] object with offsets
        resumeSizes,       // source files lengths
        sourceFiles,       // String[] with source files paths
        gridFtpClientDst,  // GridFTPClient object for destination
        resumeOffsets,     // long[] object with offsets
        destinationFiles,  // String[] with destination files paths
        new MarkerListener() {
            public void markerArrived(Marker arg0) {
                // omitted simple control method
            }
        },
        new MultipleTransferCompleteListener() {
            public void transferComplete(MultipleTransferComplete mtc) {
                // omitted simple control method
            }
        }
    );
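Because the call blocks without throwing, a timeout wrapper could at least turn the hang into an exception rather than a manual kill. The following is a sketch using only java.util.concurrent, where the Callable stands in for the blocking extendedMultipleTransfer call. The names TransferWatchdog and runWithTimeout are hypothetical, and it is an assumption that the jGlobus call reacts to thread interruption at all.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical wrapper: runs a blocking task and gives up after a deadline.
final class TransferWatchdog {
    static <T> T runWithTimeout(Callable<T> task, long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException {
        // Daemon thread, so a task that never returns cannot keep the JVM alive
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "transfer-watchdog");
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            // Blocks until the task finishes or the timeout elapses
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Requests interruption of the worker thread; a task that
            // ignores interrupts keeps running in the background
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A TimeoutException from this wrapper only shows that the call did not return in time. It does not close the control or data channels, so the two GridFTPClient instances would still need to be cleaned up separately.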