Open solarflare045 opened 10 years ago
I'm not sure I understand what you mean by "automatically cleaned up." The raw tcp socket is passed to the get() callback and the only time that socket is destroyed is on data connection timeout.
Yeah, that appears to be fine. However, the Readable stream that the .get() callback provides is immediately destroyed when the stream from the FTP server to the library ends. This is noticeable as stream.destroyed equals true at this point.
var ftp = require('ftp');
var client = new ftp();
var fs = require('fs');
client.on('ready', function() {
  client.get('/www/favicon.ico', function(err, stream) {
    if (err) {
      console.log('ERROR!');
      return;
    }
    console.log('Stream Destroyed? ' + stream.destroyed);
    stream.once('close', function() {
      console.log('Stream Destroyed? ' + stream.destroyed);
      client.end();
    });
    stream.pipe(fs.createWriteStream('favicon.ico'));
    // Simulate flow-control
    setTimeout(function() {
      stream.pause();
      setTimeout(function() {
        stream.resume();
      }, 3000);
    }, 10);
  });
});
client.connect(/* Host, User and Password */);
I guess the expected flow would be that you would push the last packet to the Readable stream, then the stream would emit the 'data' event, and then you would clean up the stream. However, if the stream is paused at that point (say, due to .pipe()'s own internal flow control), all the data after the pause is lost.
Edit: Just to clarify, the above example has issues if the FTP stream finished (i.e. the file had been fully received) in between the .pause() and the .resume().
What node version are you using?
node --version
v0.10.22
Ok so if I understand correctly, the complaint is that when you end the control connection, the data connection is immediately severed as well? If that's the case, I don't think there is much I can do about that because that's the server doing that.
Perhaps you should call client.end() on the local writable file stream's 'finish' event instead of the source socket's 'close' event?
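Roughly something like this (an untested sketch; the remote path and local filename are just placeholders):

client.get('/www/favicon.ico', function(err, stream) {
  if (err) throw err;
  var out = fs.createWriteStream('favicon.ico');
  out.on('finish', function() {
    // only end the control connection once the local file is fully flushed
    client.end();
  });
  stream.pipe(out);
});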
I have tried not having that event at all ... it was still a problem.
What did solve my problem was using:
stream.on('data', function(chunk) {
  res.write(chunk);
});
Instead of:
stream.pipe(res);
(Where stream was from your library, and res is an http.ServerResponse.)
And the complaint is more that when the data connection completes, the stream object you provide is severed. Meaning if the stream object was paused at that point in time, you lose the rest of the data. If it wasn't paused, there is no issue.
I don't know then. That's a raw socket you're working with and node-ftp (as far as I can tell) isn't meddling with it in any way that would cause something like that. :-\
I have the same problem.
It happens most often on very small files over very fast networks. May be a node issue, but I can't use pipe with node-ftp because of it. Happens almost every time on fast production hardware.
I think @polidore is right. I have the same issue. I'm connecting to a server and downloading a large number of files (over 4,000). Besides the EMFILE errors that pop up if I don't use graceful-fs, I notice that many of the files come down empty. My guess is that the pipe is not really piping all the data. The conditions in my situation are similar to what @polidore described: small files over a fast network (and many of them). This is probably an underlying issue with node.
I've wasted a week of my life on this! This definitely seems to be a Node issue relating to 0.10.x. I've tried to replicate the bug with Node 0.11.13 and it seems fine. I think it was fixed with this release of Node: http://blog.nodejs.org/2013/05/13/node-v0-11-2-unstable/.
Edit: Ignore me. Just retried it a few more times and the same issue is happening.
I am not sure the problem I was having is the same as yours. The thing is, the stream was being closed at some random point before I was able to pipe it somewhere else. It was even weirder than that, since after the piping I would get an error. After a few tests I found out this stops happening if I process only one file at a time. I simply create a bunch of tasks for all the files I want to process this time and call them using async.series. Hope this helps someone! =)
Doesn’t that mean that you are opening a new connection for each file? Have you found that to be a little slow?
Sorry, I don't think I understand. What I do is list all the directories that interest me and find the files that I need. I basically store all the paths to the files. When I am done with this, I create a task for each path and then process them in a serial way. Something like:
var tasks = [];
filePaths.forEach(function(filePath) {
  (function(filePath) {
    var task = function(done) {
      ftp.get(filePath, function(error, stream) {
        stream.pipe(process.stdout);
        stream.on("close", function() {
          done();
        });
      });
    };
    tasks.push(task);
  })(filePath);
});
async.series(tasks, function() {
  console.log("Done processing all files");
});
So yes, I make a GET for each path but using the same FTP client. Did I answer your question?
Yes, thanks!
I see a similar behaviour: I call stream.pause() immediately after a get and still receive the "end" and "close" events for the stream, without using pipe or on('data'). According to the nodejs docs, "the end event will not fire unless the data is completely consumed": https://nodejs.org/docs/v0.10.37/api/stream.html#stream_event_close
However, with node-ftp the stream sends "end" while no data has been consumed yet.
Looking at the code, resume() is called after the callback (https://github.com/mscdex/node-ftp/blob/master/lib/connection.js#L618), so my pause is cancelled because it happens before resume() is called in the node-ftp lib. That's very confusing; a way around this is to call pause() within a setTimeout. I wonder why resume() is called after the callback at https://github.com/mscdex/node-ftp/blob/master/lib/connection.js#L618?
Calling pause() within a setTimeout does pause the stream, but the end and close events still arrive before any data is consumed.
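For reference, the deferred pause looks roughly like this (a sketch; '/some/file' is a placeholder, and this only works around the library's own resume() call, it does not fix the early 'end'):

client.get('/some/file', function(err, stream) {
  if (err) throw err;
  // a synchronous pause() here is undone when node-ftp calls resume() after
  // the callback returns; deferring it to a timer keeps the stream paused
  setTimeout(function() {
    stream.pause();
    // ... attach consumers and resume() later ...
  }, 0);
});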
The end event issue is due to node-ftp: if I comment out the lines at https://github.com/mscdex/node-ftp/blob/master/lib/connection.js#L555-L569, it behaves as expected. With that code in place, the end event is never sent because of the return statement outside of the if block; as a result, the stream consumers never receive end and hang.
NW 0.12.3. It looks like a related bug.
readStreamFromFTP.pipe(decryptor).pipe(writeStreamToHDD);
Error: error:0606506D:digital envelope routines:EVP_DecryptFinal_ex:wrong final block length
The read stream finishes and closes 'decryptor' (from the crypto lib, an AES decryptor) before it has read all the data from FTP, so it calls 'decryptor.final()' on data of the wrong length and throws the Error.
After I put the 'decryptor' writes in the 'data' and 'close' events of the FTP stream, it worked fine. The bug occurs on fast machines with file sizes over 1 MB. I ran it in a loop and got the error after 1-25 transfers (1 MB file).
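In other words, roughly this (a sketch based on the snippet above; readStreamFromFTP, decryptor and writeStreamToHDD are the same streams as before):

decryptor.pipe(writeStreamToHDD);
readStreamFromFTP.on('data', function(chunk) {
  // feed the decryptor manually instead of piping from the FTP stream
  decryptor.write(chunk);
});
readStreamFromFTP.on('close', function() {
  // finalize the decryptor only once the FTP data connection is really done
  decryptor.end();
});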
Maybe a queue layer using async? Put the files on a queue (like FileZilla does) and throttle it. I mean, file I/O is so damn fast that forcing node to be a bit more synchronous wouldn't hurt.
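For example, something like this (an untested sketch, assuming the async module and a hypothetical downloadFile(path, cb) helper that wraps client.get() and writes to disk):

var async = require('async');

// process one transfer at a time, FileZilla-style
var queue = async.queue(function(filePath, done) {
  downloadFile(filePath, done); // hypothetical helper wrapping client.get()
}, 1);

filePaths.forEach(function(filePath) {
  queue.push(filePath);
});

queue.drain = function() {
  console.log('All files transferred');
};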
Perhaps related to https://github.com/mscdex/node-ftp/issues/66 ? There's a workaround there which may help.
What worked for me is this:
ftpClient.get("fileNameOnFtp", function (err, stream) {
  if (!!err) {
    var errMessage = 'downloading the file: "' + metadata.filename + '" is failed. Error message:' + err.toString();
    error(errMessage);
  } else {
    var localDownloadedFileName = "localFileName";
    var writeStream = fs.createWriteStream(localDownloadedFileName);
    stream.on('data', function(chunk) {
      writeStream.write(chunk);
    });
    stream.on('end', function () {
      writeStream.close();
      callback();
    });
  }
});
Same problem for me, has anyone solved it?
It appears the readable stream is automatically cleaned up when the library has finished receiving a file. This can be a problem when using .pipe() to pipe to a slower stream (such as an HTTP response).
The best way to replicate this is to download a file over FTP and pipe it to an HTTP response where the HTTP connection is considerably slower than the FTP connection.
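A minimal way to reproduce (a sketch; the host, credentials and remote path are placeholders):

var http = require('http');
var Ftp = require('ftp');

http.createServer(function(req, res) {
  var client = new Ftp();
  client.on('ready', function() {
    client.get('/some/large/file.bin', function(err, stream) {
      if (err) { res.statusCode = 500; return res.end(); }
      // if the HTTP client is slower than the FTP link, pipe() pauses the
      // source; data still buffered when the FTP transfer ends can be lost
      stream.pipe(res);
      stream.once('close', function() { client.end(); });
    });
  });
  client.connect({ host: 'ftp.example.com', user: 'user', password: 'pass' });
}).listen(8080);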