theophilusx / ssh2-sftp-client

a client for SSH2 SFTP

Sometimes it gets stuck and does not execute the full function/code, and does not even throw any exception; it just gets stuck at some point. #540

Open STDigitalIN opened 1 week ago

STDigitalIN commented 1 week ago

error: CLIENT[sftp]: connect: Debugging turned on {"timestamp":"2024-06-21T17:25:55.050Z"}
error: CLIENT[sftp]: ssh2-sftp-client Version: 9.0.4 { "node": "18.16.1", "acorn": "8.8.2", "ada": "1.0.4", "ares": "1.19.1", "brotli": "1.0.9", "cldr": "42.0", "icu": "72.1", "llhttp": "6.0.11", "modules": "108", "napi": "8", "nghttp2": "1.52.0", "nghttp3": "0.7.0", "ngtcp2": "0.8.1", "openssl": "3.0.9+quic", "simdutf": "3.2.2", "tz": "2022g", "undici": "5.21.0", "unicode": "15.0", "uv": "1.44.2", "uvwasi": "0.0.15", "v8": "10.2.154.26-node.26", "zlib": "1.2.13" } {"timestamp":"2024-06-21T17:25:55.050Z"}
error: CLIENT[sftp]: connect: Connect attempt 1 {"timestamp":"2024-06-21T17:25:55.050Z"}
error: CLIENT[sftp]: getConnection Unexpected end event - ignoring {"timestamp":"2024-06-21T17:25:55.475Z"}
error: CLIENT[sftp]: connect endListener - ignoring handled error {"timestamp":"2024-06-21T17:25:55.475Z"}
error: CLIENT[sftp]: Global end event: Handling unexpected event {"timestamp":"2024-06-21T17:25:55.476Z"}
error: CLIENT[sftp]: getConnection closeListener - ignoring handled error {"timestamp":"2024-06-21T17:25:55.476Z"}
error: CLIENT[sftp]: connect closeListener - ignoring handled error {"timestamp":"2024-06-21T17:25:55.476Z"}
error: CLIENT[sftp]: Global close event: Handling unexpected event {"timestamp":"2024-06-21T17:25:55.476Z"}

After this it does not log anything and does not throw any error/exception, and my job worker gets stuck because the job is never completed.

Using it with Express and Node v18.16.1.

theophilusx commented 1 week ago

It looks like you're using version 9.0.4. Please verify you're running the latest version, i.e. 10.0.3.

The provided log indicates your remote sftp server is closing the connection. Ensure you have updated to version 10.0.3 and, if the issue continues, create a test script which is able to reproduce the problem and post it. I cannot do anything if I cannot reproduce the issue.
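For reference, a minimal sketch of the kind of test script being asked for. The host, credentials and remote path below are hypothetical placeholders and must be replaced; the debug callback is assumed to be wired to console.error so library debug output can be captured alongside any hang.

```js
'use strict';

const Client = require('ssh2-sftp-client');

// Placeholder connection details - replace with real values before running.
const config = {
  host: 'sftp.example.com',
  port: 22,
  username: 'user',
  password: 'secret',
  // forward the library's debug output so a hang can be correlated with the log
  debug: (msg) => console.error(msg),
};

async function main() {
  const sftp = new Client('repro');
  try {
    await sftp.connect(config);
    // placeholder remote path; with no destination argument, get() resolves to a Buffer
    const data = await sftp.get('/remote/path/to/file');
    console.log(`get returned ${data.length} bytes`);
  } catch (err) {
    console.error('caught:', err.message);
  } finally {
    await sftp.end();
  }
}

main();
```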

pjbrigden commented 6 days ago

This exact issue is happening to me too. I have updated to version 10.0.3 and it has not been resolved. @STDigitalIN are you also trying to connect to an Azure SFTP server?

devgk882 commented 1 day ago

Was this issue solved? I think I am also having the same issue, where sftp.get() sometimes won't throw an error and just hangs. This mostly happens for large files.

theophilusx commented 15 hours ago

As yet, nobody has provided sufficient information to reproduce or even confirm there is an issue. My suspicion is that this will be either a symptom of network problems or of an incomplete/poor sftp server implementation. All similar issues reported in the last 5+ years have been found to be network- or server-related. Therefore, on the balance of probabilities, it would likely be best to eliminate possible network and server issues first.

In the past, there have been multiple similar issues which turned out to be due to network stack issues associated with either docker or WSL. Testing where neither of these technologies is involved would be worthwhile.

There have also been previous instances of problems when using sftp servers running on cloud platforms, in particular Azure and Lambda. Again, testing with servers running on different platforms might be useful. If the issue can only be reproduced on, for example, Azure, then it is likely that either there is an Azure network issue or the sftp server implementation has issues.

Finally, the next most common cause of problems is simple bugs in client scripts. A very common error is to use array iteration methods like map, forEach, reduce etc. These array methods are NOT async safe and won't work correctly with async/await without taking additional precautions. A simple verification is to replace any such loops with standard for loops and re-test. The other common error is to miss a return statement in a function, so instead of returning the promise object, the function returns immediately and the script ends up calling end before the get method has completed. Both patterns are sketched below.
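To illustrate those two patterns, a sketch assuming an already-connected client and hypothetical remote/local paths:

```js
// BUG 1: forEach does not await its async callback, so this function
// returns while every get() is still in flight; a subsequent end()
// can then tear down the connection mid-transfer.
async function downloadAllBroken(sftp, files) {
  files.forEach(async (f) => {
    await sftp.get(f.remote, f.local);
  });
}

// FIX: a plain for...of loop awaits each get() before continuing.
async function downloadAll(sftp, files) {
  for (const f of files) {
    await sftp.get(f.remote, f.local);
  }
}

// BUG 2: the promise from get() is dropped, so callers awaiting this
// function continue (and may call end()) immediately.
function fetchBroken(sftp, remote, local) {
  sftp.get(remote, local); // missing return
}

// FIX: return the promise so the caller can await completion.
function fetchFile(sftp, remote, local) {
  return sftp.get(remote, local);
}
```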

If, after doing all of the above, the issue still occurs, the next step would be to create the smallest, simplest script which is able to generate the issue and try to narrow it down to the specific set of criteria necessary for us to reliably reproduce the issue. Until we can reproduce the issue, not only can we not diagnose it, we won't be able to confirm any proposed fix for it. In other words, we must be able to first reproduce this issue in order to fix it.

So far, I have tested against local and remote Linux servers (Fedora, Debian and Arch) using IPv4 and IPv6, with both small (< 1MB) and large (> 1GB) files, and with concurrent and single get calls, using both promise chains and async/await, with no errors. This has been with the destination argument being a writeable stream (either a file or process.stdout).
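For anyone wanting to run a comparable test, a sketch of a get() into a writeable stream destination. The host, credentials and paths are hypothetical and must be replaced.

```js
const fs = require('node:fs');
const Client = require('ssh2-sftp-client');

// Placeholder connection details and paths - replace before running.
async function streamGet() {
  const sftp = new Client('stream-test');
  try {
    await sftp.connect({
      host: 'sftp.example.com',
      username: 'user',
      password: 'secret',
    });
    // get() with a writeable stream destination resolves once the transfer completes
    const out = fs.createWriteStream('/tmp/large-file.bin');
    await sftp.get('/remote/large-file.bin', out);
    console.log('get completed');
  } finally {
    await sftp.end();
  }
}

streamGet().catch((err) => console.error(err.message));
```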