Closed · adamtal3 closed this issue 7 years ago
I've managed to get a test user for a different server.
Now I don't get a `Malformed packet` error, but I do get wrong results when uploading/downloading non-small files (it keeps reproducing: when I upload a 29 MB file, a ~28 MB file ends up on the server, and the resulting file size is different on each try).
This is the code I'm running for upload:

```javascript
var progressCallback = function (totalFile, totalTransferred) {
  log('progress: ' + totalTransferred + '/' + totalFile + ' ' +
      Math.floor((totalTransferred / totalFile) * 100) + '%');
};

var total = 0;
sftp.fastPut(localPath, remotePath, {
  step: function (totalTransferred, chunk, totalFile) {
    total = totalFile;
    progressCallback(totalFile, totalTransferred);
  }
}, function (err) {
  if (err) {
    onError(err);
  } else {
    progressCallback(total, total);
    uploadCompletedCallback();
  }
});
```
And this is the code for download:
```javascript
var total = 0;
sftp.fastGet(source, target, {
  step: function (totalTransferred, chunk, totalFile) {
    total = totalFile;
    progressCallback(totalFile, totalTransferred);
  }
}, function (err) {
  if (err) {
    onError('Download failed. ' + JSON.stringify(err));
  } else {
    progressCallback(total, total);
    downloadCompletedCallback();
  }
});
```
When uploading, this is the log:
```
2016-10-18T15:20:49.032Z - debug: progress: 32768/30839489 0%
... more progress logs
2016-10-18T15:31:38.503Z - debug: progress: 30741185/30839489 99%
2016-10-18T15:31:38.549Z - debug: progress: 30773953/30839489 99%
2016-10-18T15:31:39.288Z - debug: progress: 30806721/30839489 99%
2016-10-18T15:31:39.867Z - debug: progress: 30839489/30839489 100%
2016-10-18T15:31:40.194Z - debug: progress: 30839489/30839489 100%
```
And this is the log for download:
```
2016-10-18T15:32:06.973Z - debug: progress: 32768/29491200 0%
2016-10-18T15:32:07.373Z - debug: progress: 65536/29491200 0%
... more progress logs
2016-10-18T15:33:17.268Z - debug: progress: 8093696/29491200 27%
2016-10-18T15:33:17.317Z - debug: progress: 8126464/29491200 27%
2016-10-18T15:33:17.790Z - debug: progress: 8159232/29491200 27%
2016-10-18T15:33:18.238Z - debug: error: {"code":"ECONNRESET","errno":"ECONNRESET","syscall":"read","level":"client-socket"}
```
As you can see, the downloaded file is only partial. But even though the step function indicates the entire file was uploaded, looking at the server the file is 28.13 MB instead of 29.4 MB. This keeps reproducing (the resulting sizes vary a little).
I've tried the same process using FTP software and it worked (so the server seems OK).
This is a detailed upload log of an upload interrupted by a connection reset: https://gist.githubusercontent.com/adamtal3/678129cc456e937fa8e855a31cd0895d/raw/7601b942d4db89ae1dc1460e47e4ca4dd5634535/upload-connection-reset
And this is an upload log of a file that uploaded completely, but when looking at it on the server its size is 29491200 and not 30839489 as the log indicates (how can that be?): https://gist.githubusercontent.com/adamtal3/678129cc456e937fa8e855a31cd0895d/raw/7601b942d4db89ae1dc1460e47e4ca4dd5634535/upload-complete-wrong-size
Can you understand anything from this? How can the resulting file be smaller than the size reported by the step function? Why does it work with third-party SFTP software but not with ssh2? Is there a client-side cause of the connection reset? Is there anything to do about it? Can I reconnect and continue the download/upload from where it disconnected?
I can reproduce it with the sftp server you originally linked to. I am looking into it.
Ok, it seems the problem is with the server implementation. They are sending 13 more bytes (32781 total) than the max packet size (32768) we requested for the sftp channel.
EDIT: I've sent Rebex an email about it.
Thanks for that!
The issues in my last comment are not related to the Rebex server. Can you make anything of the logs I've attached? Can you give me a lead on how I can find a solution to the problem?
@adamtal3 Right, I was working on a separate reply to your other scenario. :-)
Regarding connection reset, it looks sudden, so there is no apparent reason for the connection to be interrupted. Perhaps server-side logging will help in that case?
I've also tried to duplicate the problems with the CrushFTP server you've tried, but downloading worked for me every time (with the same 90MB file used for testing with the first (Rebex) server). I tried uploading a file and while the uploaded file size was correct, it seems CrushFTP isn't writing the file correctly when there is more than one outstanding write at any given time. It writes some chunks to the file as all zeros. My guess is that they also have a bug, since I have verified ssh2 is sending the correct file offsets and data in each chunk.
I have also sent CrushFTP support an email about it.
I really appreciate how helpful you are.
If I'm bound to use CrushFTP with no control over the server, what can I do? Why can software like FlashFXP upload with no problem, while running the transfer through ssh2 has problems? Should using `put` instead of `fastPut` work? And if so, how can I get a progress indication for that process?
Thanks again
For CrushFTP servers you could limit the concurrency to 1 (see the config options that can be passed to `fastPut()`) until the problem is resolved.
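A minimal sketch of what that looks like, reusing the `progressCallback` from the upload snippet earlier in the thread (`singleWriteOptions` and `done` are hypothetical names of my own, not part of the ssh2 API):

```javascript
// Sketch: build fastPut()/fastGet() options limited to a single outstanding
// write, for servers (like this CrushFTP case) that mishandle concurrent,
// out-of-order writes.
function singleWriteOptions(progressCallback) {
  return {
    concurrency: 1,    // at most one write request in flight at a time
    chunkSize: 32768,  // the default chunk size used by ssh2-streams
    step: function (totalTransferred, chunk, totalFile) {
      progressCallback(totalFile, totalTransferred);
    }
  };
}

// Usage against a live connection (illustration only):
// sftp.fastPut(localPath, remotePath, singleWriteOptions(progressCallback), done);
```

The same options object works for `fastGet()`, since both methods accept `concurrency`, `chunkSize`, and `step`.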
It's hard to say about FlashFXP. Are you sure it's using SFTP and not FTP/FTPS? I could not find any information on how it transfers files via SFTP; maybe it performs at most one write at any given time for a file? It's hard to say with closed-source products :-)
Like I said, calling `fastPut()` with a concurrency of 1 should always work (though this is equivalent to using `sftp.createWriteStream()`). I am not sure what you mean by using `put` instead of `fastPut`.
Using it with `concurrency: 1` solves the issue.
About FlashFXP: I'm using it with SFTP over SSH.
For some reason I thought you had a `put` method that is the same as `fastPut` but without the concurrency. Looking at the docs I don't see it, so I must have confused it with something else.
I guess I'll stick to `concurrency: 1` (and might consider letting the user change it) to be sure things always work.
You may close the issue if you think there's nothing further to do with it.
Big thanks for the huge help!
I have one final question that remained unanswered: can I continue from where I stopped (in an upload/download process) after getting a connection reset, or do I have to start a new upload? (This could be a problem with a very big file and an unstable connection.)
@adamtal3 You probably could continue where the transfer failed, although you'd have to do that manually since there is no automatic resume support.
I will give the developers of the two SFTP server products some time to respond before closing this issue.
Can you elaborate with a few words on "do that manually"? Is there a way to seek the stream to where it stopped? Or is it impossible to keep writing to the same file after the connection is lost?
What I mean is that you will have to duplicate the code for `.fastPut()`/`.fastGet()` and tweak it to allow for resuming at a particular offset. There is no seeking per se, since SFTP uses absolute offsets ("write this chunk of data at this exact offset in the file").
FWIW I would be open to a PR to `ssh2-streams` to add such a resume feature, if you're up for it.
I've received a reply from Rebex: they have acknowledged the bug (they were using a fixed max packet size) and are working on a fix. FWIW, I will keep you posted when the fix is available.
Great.
I'll try to take a look at `ssh2-streams`'s code, hoping that adding resume functionality isn't a huge hassle.
I also got a response from CrushFTP. It looks like they have a design limitation since they allow various storage backends (not just local disk) and according to the CrushFTP support, performing "out of order" writes to some of those storage backends can be "expensive." They said they are working on a potential workaround involving a user-configurable buffer of sorts to help solve the problem.
Update: the CrushFTP developer(s) have updated their software to include a new setting that allows for buffering writes in memory before writing to disk, which seems to "fix" the issue, but it's certainly not foolproof. However, I think it should work fine in the majority of cases. The setting is in prefs.xml and is a boolean called `sftp_buffered_write`, which defaults to `false`. Set it to `true` to enable the fix. They also told me that this setting will default to `true` in version 8 of their product and will not be configurable there.
The underlying issue turned out to be that, for whatever reason, out-of-order writes were not working correctly. AFAIK Windows shouldn't have a problem with that; it will just literally fill in the gap with zeros. However, this can be avoided by telling Windows it's a sparse file, but the developer(s) said their app is coded in Java and may not have a way to do this from Java. They also said that most of their users upload to remote sites rather than to a local disk, so setting sparse file mode wouldn't help much. :shrug:
Started testing the option of reconnecting after a disconnect, and found something interesting.
Instead of using `fastPut` with `concurrency: 1`, I'm using `createWriteStream` and wrapping the read stream with `progress-stream` in order to get progress reports. Not only can I reconnect after a disconnect (using the `start` option); I also haven't gotten a single connection reset in any of my tests so far (it happened a lot when using `fastPut`).
Just in case someone else needs this information.
And thanks again for your help!
Yes, those two are more or less equivalent with `concurrency: 1`. For most servers that support greater concurrency, though, you will definitely want to use the `fast*()` methods instead of `create*Stream()`.
To give this some closure, I have received an update from the Rebex developer(s) saying that they have released versions of their components and servers with their bug fix.
I can `fastGet` a 100 KB file, but when I try to do the same with a 90 MB file I get the `Malformed packet` error. I can't seem to find the cause of it.
I'm using this test SFTP server (maybe that's the cause of the problem?).
This is the debug log from the connection: