fast-data-transfer / fdt

FDT is an Application for Efficient Data Transfers which is capable of reading and writing at disk speed over wide area networks (with standard TCP). It is written in Java, runs on all major platforms, and is easy to use. FDT is based on an asynchronous, flexible multithreaded system and uses the capabilities of the Java NIO libraries.
https://fast-data-transfer.github.io/
Apache License 2.0

Is there a bug with the transfer port? #57

Open Elyasin opened 3 years ago

Elyasin commented 3 years ago

I cannot pull data from the server a second time.

On the client side I see: java.lang.Exception: java.lang.IllegalArgumentException: port out of range:-1

On the server side I see: 2021-05-26 03:42:27 INFO lia.util.net.copy.transport.ControlChannel sendMsgImpl [ ControlChannel ] sending message tag ( 15 ): REMOTE_TRANSFER_PORT msg: -1
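The two log lines fit together if the client passes the `-1` it receives in `REMOTE_TRANSFER_PORT` straight into a socket address. That is an assumption about FDT's internals, not something verified against its source. The sketch below only shows that the standard `java.net.InetSocketAddress` rejects `-1` with the same message; `PortOutOfRangeDemo` and `tryPort` are hypothetical names used for illustration.

```java
import java.net.InetSocketAddress;

class PortOutOfRangeDemo {
    // Hypothetical helper: returns the exception message if the port is
    // rejected, or null if the port is accepted.
    static String tryPort(int port) {
        try {
            // createUnresolved validates the port without doing a DNS lookup.
            InetSocketAddress.createUnresolved("localhost", port);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryPort(-1));    // a rejected port yields a message
        System.out.println(tryPort(54321)); // a valid port yields null
    }
}
```

If this reading is right, the client-side exception is only a symptom: the real problem is that the server had no port to offer and sent `-1` instead.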

This happens after successfully pulling a directory recursively. The first pull works fine; the second does not.

This is the command I use for the first pull on the client side: java -jar fdt.jar -pull -r -c <ip-address> -d <dest-dir> <remote-dir> -bs 8M -P 8 -bio

The command I use on the server side: java -jar fdt.jar -f <filtered-ip-address> -bio -bs 8M

mshedsilegx commented 3 years ago

I noticed the same issue. In my case, the transfer ports are never released by the server, even if the client disconnects gracefully. After many transfers, with the server up for a long period of time, we run out of ports and the issue above appears. Due to firewall policies, we have to allocate transfer ports on the server side via -tp, and it looks like we can allocate at most 10. So after 10 transfers, we get the error above. The only way to recover is to bounce the Java server process.
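The behavior described here matches a fixed pool of ports where nothing is ever handed back. The following is a minimal sketch of such a pool, not FDT's actual code: `TransferPortPool`, `acquire`, and `release` are hypothetical names, and returning `-1` when the pool is empty is an assumption taken from the `REMOTE_TRANSFER_PORT msg: -1` log line.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TransferPortPool {
    // Ports that are currently free to hand out.
    private final Deque<Integer> free = new ArrayDeque<>();

    TransferPortPool(int... ports) {
        for (int p : ports) {
            free.add(p);
        }
    }

    // Returns a free port, or -1 when the pool is exhausted.
    synchronized int acquire() {
        Integer p = free.poll();
        return p == null ? -1 : p;
    }

    // Returns a port to the pool; ignores the -1 sentinel.
    synchronized void release(int port) {
        if (port > 0) {
            free.add(port);
        }
    }

    public static void main(String[] args) {
        TransferPortPool pool = new TransferPortPool(54321, 54322);
        int a = pool.acquire();
        int b = pool.acquire();
        System.out.println(pool.acquire()); // pool is empty here
        pool.release(a);                    // the step that appears to be missing
        System.out.println(pool.acquire()); // the released port is reusable
    }
}
```

In a design like this, calling `release` in a `finally` block when a session ends, whether it succeeded or failed, would keep the pool from draining. Without that call, the number of transfers before a restart equals the pool size.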

Harsh6k3 commented 2 years ago

The support email, support-fdt@monalisa.cern.ch, does not work. I tried the following cases:

Case: server started, client 1 started; the initial autosys job states were terminated/failed.

Error in the client's log: java.lang.IllegalArgumentException: port out of range:-1

Error in server log:

2021-12-15 08:46:04 INFO lia.util.net.copy.FDTSession handleGetRemoteTransferPortMessage [ FDTSession ] [ handleGetRemoteTransferPortMessage ( rtp )

2021-12-15 08:46:04 WARNING lia.util.net.copy.FDTSession handleGetRemoteTransferPortMessage Exception while handling 'get remote transfer port' message java.net.BindException: Address already in use: bind
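A `java.net.BindException` of this kind is what Java raises when a listening socket is opened on a port that another listening socket still holds. The sketch below reproduces that with plain `java.net.ServerSocket`; it says nothing about where FDT keeps the old socket open. `BindTwiceDemo` and `secondBindFails` are hypothetical names, and the exact exception message varies by platform.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.ServerSocket;

class BindTwiceDemo {
    // Returns true if binding a second listener to an occupied port is rejected.
    static boolean secondBindFails() {
        // Port 0 asks the OS for any free port.
        try (ServerSocket first = new ServerSocket(0)) {
            // Try to listen on the same port while the first socket is open.
            try (ServerSocket second = new ServerSocket(first.getLocalPort())) {
                return false;
            } catch (BindException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind rejected: " + secondBindFails());
    }
}
```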

Output: only one file was transferred; the second file was not, and another client cannot be started either.

Case: after the autosys job failed in the step above, I restarted the same job 30 minutes later.

Error in client log:

2021-12-15 09:19:07 INFO lia.util.net.copy.transport.ControlChannel sendMsgImpl [ ControlChannel ] sending message tag ( 15 ): REMOTE_TRANSFER_PORT msg: rtp

2021-12-15 09:19:08 INFO lia.util.net.common.Utils getFDTTransferPort Got transfer port: ldhetl1-dev.jpmchase.net:-1

2021-12-15 09:19:08 WARNING lia.util.net.copy.FDTSessionManager addFDTClientSession Got exception initiation Session/RemoteConn java.lang.Exception: java.lang.IllegalArgumentException: port out of range:-1

Error in server log: WARNING lia.util.net.copy.FDTSession handleGetRemoteTransferPortMessage There are no free transfer ports at this moment, please try again later

mshedsilegx commented 2 years ago

It is my understanding that only 10 transfer ports are allowed. In our case, those are never released after a transfer, so we end up running out of ports and get the same error: port out of range:-1. I have not yet heard of any fix from the development team. Restarting the server process releases those ports, and we can then transfer from the client 10 more times. Not a sustainable solution, for sure.