mafintosh / tar-stream

tar-stream is a streaming tar parser and generator.
MIT License
400 stars 93 forks source link

Uncatchable error in pack.entry #165

Open MorningLightMountain713 opened 5 months ago

MorningLightMountain713 commented 5 months ago

Hi there, thanks for this great package.

I have an issue where I'm using tar-fs, and by extension, tar-stream

The issue I have is an edge case, where if an end user forgets to add -o filename to the end of the curl command, it crashes my node app, and I'm unable to catch the error. (I'm using express and streaming a response)

It stems from this:

  entry (header, buffer, callback) {
    if (this._finalized || this.destroying) throw new Error('already finalized or destroyed')

and in tar-fs it doesn't catch the error. As this is asynchronous, when I try catch the error, it hasn't happened yet.

    if (stat.isDirectory()) {
      header.size = 0
      header.type = 'directory'
      header = map(header) || header
      return pack.entry(header, onnextentry)
    }

How can I fix this? Thanks.

mafintosh commented 5 months ago

tar-fs should not call entry when that condition is true so its a bug there, have any easy repro, then i'll fix.

MorningLightMountain713 commented 5 months ago

I assume the amount of files / size of files would also determine when this bug is hit. For different combinations of files and directories, some of them would error, some wouldn't.

To reproduce:

Here is the file system I was using:

davew@chud:~/test_files$ ls
a  b  c
davew@chud:~/test_files$ du -h
1.1G    ./c
1.1G    ./a
1.1G    ./b
3.2G    .
davew@chud:~/test_files/$ find . | wc -l
1540

Files created in each a b c directory with:

for n in {1..512}; do dd if=/dev/urandom of=file$( printf %03d "$n" ) bs=4248 count=512; done

Here is the MRE:

const tar = require('tar-fs');
const { pipeline } = require('node:stream/promises');
const express = require('express');

const app = express();

async function streamData(_, res) {
  const tarStream = tar.pack("test_files");

  const work = pipeline(tarStream, res);

  try {
    await work;
  } catch (err) {
    console.log("Stream error:", err.code);
  }
}

app.post("*", streamData);

app.listen(33333, () => {
  console.log(`Server listening on port: ${33333}`);
});

Here is the call (made from another machine) obviously, this would work if I added -o filename:

> curl -vvv -H 'Content-Type: application/json' -X POST http://172.16.32.12:33333
*   Trying 172.16.32.12:33333...
* Connected to 172.16.32.12 (172.16.32.12) port 33333
> POST / HTTP/1.1
> Host: 172.16.32.12:33333
> User-Agent: curl/8.4.0
> Accept: */*
> Content-Type: application/json
>
< HTTP/1.1 200 OK
< X-Powered-By: Express
< Date: Wed, 07 Feb 2024 14:18:51 GMT
< Connection: keep-alive
< Keep-Alive: timeout=5
< Transfer-Encoding: chunked
<
Warning: Binary output can mess up your terminal. Use "--output -" to tell
Warning: curl to output it to your terminal anyway, or consider "--output
Warning: <FILE>" to save to a file.
* Failure writing output to destination
* Failed reading the chunked-encoded stream
* Closing connection

And the result:

davew@chud:~/test_files$ node test.js
Server listening on port: 33333
Stream error: ERR_STREAM_PREMATURE_CLOSE
/home/davew/test_get_files/node_modules/tar-stream/pack.js:138
    if (this._finalized || this.destroying) throw new Error('already finalized or destroyed')
                                            ^

Error: already finalized or destroyed
    at Pack.entry (/home/davew/test_get_files/node_modules/tar-stream/pack.js:138:51)
    at onstat (/home/davew/test_get_files/node_modules/tar-fs/index.js:70:19)
    at /home/davew/test_get_files/node_modules/tar-fs/index.js:353:9
    at FSReqCallback.oncomplete (node:fs:189:23)

Node.js v20.9.0
davew@chud:~/test_files$

If I delete one of the folders (or remove some files), it works fine:

Server listening on port: 33333
Stream error: ERR_STREAM_PREMATURE_CLOSE

Thanks

MorningLightMountain713 commented 5 months ago

I have fixed this in my test environment with:

  function onstat (err, filename, stat) {
    if (pack.destroyed) return

Whether this is the right way to fix it or not ¯\(ツ)

Of note, I did go down the rabbit hole of https://github.com/nodejs/node/issues/7629 where I see you were involved ha