mscdex / connect-busboy

Connect middleware for busboy
MIT License
155 stars 25 forks source link

How to properly terminate an upload and clean up streams/handles? #29

Closed gkostov closed 2 hours ago

gkostov commented 4 hours ago

Hi,

I believe I am in a similar situation to https://github.com/mscdex/busboy/issues/366 - in my case I need to abort the request because the client has gone away (connection broken, etc.). I am piping the file data as per the examples like:

file.pipe(fs.createWriteStream(localFileName));

and when the client stops sending data the connection simply remains hanging around forever (both the socket and the output file handles, and uploaded file data on the disk remain, piling up to thousands with time until the app is restarted).

I'm using node@20 and Express@4 and there does not appear to be any default timeouts on the server's requests. So I tried setting a timeout on the server (which each request properly inherited) but, as per the documentation, what happens is:

When an idle timeout is triggered the socket will receive a 'timeout' event but the connection will not be severed. The user must manually call socket.end() or socket.destroy() to end the connection.

So other than a "timeout" event firing on the socket there is nothing else happening and the connection and file handles remain.

To be able to monitor what is happening, I extended the example's code to this:

let outputFileStream = fs.createWriteStream(localFileName);
console.log('outputFileStream created');
outputFileStream.on('close', () => console.log('outputFileStream closed'));
outputFileStream.on('error', () => console.log('outputFileStream errored'));
outputFileStream.on('finish', () => console.log('outputFileStream finished'));
file.on('data', () => console.log('file data comming in'));
file.on('close', () => console.log('file closed'));
file.on('end', () => console.log('file ended'));
file.on('error', err => {console.log('destroying outputFileStream'); outputFileStream.destroy(err); } );
// it makes no difference whether the timeout is set on the server or on the request here, so I set it here for clarity
req.setTimeout(1000, () => {
    console.log('request timed out now!!!!');
    req.destroy();
});
file.pipe(outputFileStream);

I was hoping that req.destroy(); will kill the request which, as you've suggested in https://github.com/mscdex/busboy/issues/366, will make it "stop writing to the busboy stream". The client browser sees the connection as terminated but busboy does not and nothing more happens after the console prints:

outputFileStream created file data coming in file data coming in request timed out now!!!!

Now I thought that maybe busboy isn't notified of the connection termination so I need to manually tell it to stop and changed this code to:

...
req.setTimeout(1000, () => {
    console.log('request timed out now!!!!');
    req.destroy();
    req.busboy.end();  // this is line 58 noted in the stack trace below
});
...

Which appears to be closing the busboy stream and my output stream, but it also crashes the application with these logs:

outputFileStream created file data coming in file data coming in request timed out now!!!! destroying outputFileStream file closed Error: Unexpected end of form at Multipart._final (/var/www/app/node_modules/busboy/lib/types/multipart.js:588:17) at prefinish (node:internal/streams/writable:914:14) at finishMaybe (node:internal/streams/writable:928:5) at Writable.end (node:internal/streams/writable:843:5) at IncomingMessage. (/var/www/app/server/modules/file.js:58:19) at IncomingMessage.emit (node:events:519:28) at Socket.socketOnTimeout (node:_http_server:778:50) at Socket.emit (node:events:519:28) at Socket._onTimeout (node:net:591:8) at listOnTimeout (node:internal/timers:581:17)

And now I'm out of ideas for how to tell busboy code to stop processing without crashing the application? Could you help with this, please?

Thanks a lot

mscdex commented 2 hours ago

how to tell busboy code to stop processing without crashing the application

Stop writing data to the busboy instance or add an 'error' event handler to the busboy instance.

gkostov commented 29 minutes ago

Adding the error event handler seems to prevent the exception from crashing the application. Thanks for that.

Though, I still don't know exactly what you mean by "Stop writing data to the busboy instance". When the client disappears then there is no more data written to the instance but busboy is still there and waiting for more. Given that busboy is piping from the Request object I thought that destroying the request (with req.destroy();) would make it obvious that busboy has nothing more to work on, but nope - busboy is still there and waiting for more. Only after req.busboy.destroy(); it will close down the file stream and itself (I guess its own stream gets GC'ed, but I haven't checked that). So my final code became:

let outputFileStream = fs.createWriteStream(localFileName);
file.on('error', err => outputFileStream.destroy(err) );
req.setTimeout(1000, () => {
    req.destroy(); // terminate the request so the socket is freed
    req.busboy
        .on('error', () => {})  // this is being set just to prevent the exception from crashing the application
        .destroy();
});
file.pipe(outputFileStream);

Do you think it is worth mentioning this caveat in the docs? I've seen a lot of examples doing what the docs say and not considering how file and socket handles will leak along with the disk space for broken uploads? I can setup a PR if you think it can be of any benefit for others.

Cheers