archiverjs / node-archiver

a streaming interface for archive generation
https://www.archiverjs.com
MIT License

Dealing with readstream errors in append #321

Open technobrent opened 6 years ago

technobrent commented 6 years ago

The following code demonstrates the issue. I used /etc/gshadow on my Ubuntu system to produce a reliable access error.

const archiver = require("archiver")
const fs = require("fs")
const path = require("path")

const archive = archiver('zip')
const contentDir = '/etc'

archive
  .on('error', err => { console.error(err) })
  .on('warning', err => { console.error(err) })
  .pipe(fs.createWriteStream('/tmp/archivetest.zip'))

archive.append(fs.createReadStream(path.join(contentDir, 'crontab')), { name: 'crontab' })
archive.append(fs.createReadStream(path.join(contentDir, 'gshadow')), { name: 'gshadow' })
archive.append(fs.createReadStream(path.join(contentDir, 'host.conf')), { name: 'host.conf' })
archive.finalize()

For starters, this throws an exception and kills the script. It doesn't trigger the archiver's 'error' or 'warning' events.

events.js:183
      throw er; // Unhandled 'error' event
      ^

Error: EACCES: permission denied, open '/etc/gshadow'

I added a listener for 'error' on the gshadow readstream to prevent the script from bombing out, but then a corrupt zip archive is produced.
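
Roughly, that first attempt looked like this (a sketch, not the exact code I ran; it reuses archive and contentDir from the snippet above):

const gshadow = fs.createReadStream(path.join(contentDir, 'gshadow'))
gshadow.on('error', err => { console.error(err) })  // keeps the process alive, but the resulting zip is corrupt
archive.append(gshadow, { name: 'gshadow' })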

My workaround was to pipe the readstream to a passthrough, replacing the gshadow line above with this:

const stream = require("stream")

const pass = new stream.PassThrough()
fs.createReadStream(path.join(contentDir, 'gshadow'))
  .on('error', err => {
    console.error(err)
    pass.end()
  })
  .pipe(pass)
archive.append(pass, { name: 'gshadow' })

That causes gshadow to be packed as an empty file and the resulting zip is valid, but this doesn't seem ideal. In my real project I'm zipping up thousands of files, and I'm not sure what the impact of this workaround will be. I think it would be better for the archiver to handle these errors itself.

ctalkington commented 6 years ago

hum, have you tried using the built-in file helper? I believe it should catch the error internally. It also has lazy reading of streams for performance in terms of open files etc.
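
Something like this, using your snippet above (untested sketch on my part):

archive.file(path.join(contentDir, 'crontab'), { name: 'crontab' })
archive.file(path.join(contentDir, 'gshadow'), { name: 'gshadow' })
archive.file(path.join(contentDir, 'host.conf'), { name: 'host.conf' })
archive.finalize()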

technobrent commented 6 years ago

No, I need to pass a stream so that I can transform the data before adding it to the archive. The code above was just to isolate the issue I was seeing. In reality I will always be wrapping these readstreams, so I guess this doesn't affect my current use case, and if I weren't doing transforms I could just use the file helper. But I would think that append should still catch stream errors and either shut down gracefully or at least isolate the stream so it doesn't corrupt the archive.
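
For context, the real code wraps each readstream in a transform before appending, roughly like this (simplified sketch reusing fs, path, contentDir and archive from the first snippet; the uppercasing transform is just a stand-in for the real processing):

const { Transform } = require('stream')

const toUpper = new Transform({
  transform(chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase())
  }
})

const source = fs.createReadStream(path.join(contentDir, 'crontab'))
// note: an error on the source readstream still won't reach the archive through pipe()
archive.append(source.pipe(toUpper), { name: 'crontab' })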

halcarleton commented 6 years ago

I also ran into this issue. I solved it by manually emitting an error from the archive stream if any of my appended streams emitted an error event.

something like

incomingStreams.forEach((stream) => {
  stream.on('error', (e) => {
    archiveStream.emit('error', e);
  });
});

ctalkington commented 6 years ago

you know, at what point does the access error fire? I'm wondering if it's on creation of the stream or the actual pipe. We have code that listens and bubbles up errors, but the stream has to be queued and appended before that happens.

technobrent commented 6 years ago

I'm not sure at what point the error is first raised, but I switched my code snippet around a little bit to investigate:

const archiver = require("archiver")
const fs = require("fs")
const path = require("path")

const archive = archiver('zip')
const contentDir = '/etc'

archive
  .on('error', err => { console.error(err) })
  .on('warning', err => { console.error(err) })
  .pipe(fs.createWriteStream('/tmp/archivetest.zip'))

const gshadow = fs.createReadStream(path.join(contentDir, 'gshadow'))

archive.append(fs.createReadStream(path.join(contentDir, 'crontab')), { name: 'crontab' })
archive.append(gshadow, { name: 'gshadow' })
console.log('appended gshadow stream')
archive.append(fs.createReadStream(path.join(contentDir, 'host.conf')), { name: 'host.conf' })
console.log('finished appends')
archive.finalize()
console.log('called finalize')

gshadow.on('error', err => {
  console.error('error event')
  console.error(err)
})

The output looks like this:

appended gshadow stream
finished appends
called finalize
error event
{ Error: EACCES: permission denied, open '/etc/gshadow'
  errno: -13,
  code: 'EACCES',
  syscall: 'open',
  path: '/etc/gshadow' }

So it looks like the append should have a chance to catch this error.

claytonrothschild commented 4 years ago

I'm experiencing this issue as well.

When I pass a request stream that will ultimately result in ECONNREFUSED, node-archiver fails silently, does not emit an error event, and crashes Node.

Here's my code:

archive.append(request('http://127.0.0.1:1000'), { name: `myFile.png` });

I was able to capture the error using @halcarleton's technique above. I listen for 'error' on the stream and then emit it on the archive:

archive.append(request('http://127.0.0.1:1000').on('error',(e) => archive.emit('error',e)), { name: `myFile.png` });

Having the archive error listener log the error:

archive.on('error', function (err) {
  console.log(err)
})

reveals this error:

Error: connect ECONNREFUSED 127.0.0.1:1000
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16) {
  errno: 'ECONNREFUSED',
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 1000
}

Unfortunately, even after handling the error by passing it to the archive, Node still crashes. Any ideas here? Node crashing when a request is refused is really affecting my production application.
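
For now I may fall back to the PassThrough workaround from the original post, so the failing request stream never touches the archive directly; roughly (untested sketch):

const { PassThrough } = require('stream')

const pass = new PassThrough()
request('http://127.0.0.1:1000')
  .on('error', (e) => {
    console.error(e)
    pass.end()  // the entry ends up in the zip as an empty file
  })
  .pipe(pass)
archive.append(pass, { name: 'myFile.png' })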

artonio commented 1 year ago

We had this happen to us when appending a stream from AWS S3: if S3 did not have the requested file, our server would crash. This may not be perfect, but this is what we ended up doing:

import Archiver from "archiver/lib/core";
import {Logger} from "@nestjs/common";
import Zip from "archiver/lib/plugins/zip";

/**
 * This class extends the base Archiver class in order to catch a KeyNotFound error from S3.
 */
export class RArchiver extends Archiver {
    private readonly logger = new Logger(RArchiver.name, { timestamp: true });

    constructor() {
        super();
        const zip = Zip({ store: true });
        this.setModule(zip);
    }

    // thin pass-throughs to the base class; return values are preserved so chaining and awaiting still work
    append(input, data) {
        return super.append(input, data);
    }

    pipe(response) {
        return super.pipe(response);
    }

    on(event: string, eventFunc) {
        return super.on(event, eventFunc);
    }

    setModule(zip: Zip) {
        super.setModule(zip);
    }

    finalize() {
        return super.finalize();
    }

    // override to not emit error
    _moduleAppend(source, data, callback) {
        // @ts-ignore
        if (this._state.aborted) {
            callback();
            return;
        }

        // @ts-ignore
        this._module.append(source, data, function (err) {
            this._task = null;

            if (this._state.aborted) {
                this._shutdown();
                return;
            }

            // !! Handle error here !!
            if (err) {
                // Original handling code
                // this.emit('error', err);

                // Log error
                this.logger.error(err);
                setImmediate(callback);
                return;
            }

            /**
             * Fires when the entry's input has been processed and appended to the archive.
             *
             * @event Archiver#entry
             * @type {EntryData}
             */
            this.emit('entry', data);
            this._entriesProcessedCount++;

            if (data.stats && data.stats.size) {
                this._fsEntriesProcessedBytes += data.stats.size;
            }

            /**
             * @event Archiver#progress
             * @type {ProgressData}
             */
            this.emit('progress', {
                entries: {
                    total: this._entriesCount,
                    processed: this._entriesProcessedCount
                },
                fs: {
                    totalBytes: this._fsEntriesTotalBytes,
                    processedBytes: this._fsEntriesProcessedBytes
                }
            });

            setImmediate(callback);
        }.bind(this));
    }
}
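
Usage is then the same as with the stock archiver; a rough sketch (res and s3Stream are placeholders for our response object and the S3 object stream):

const archive = new RArchiver();
archive.on('error', (err) => console.error(err));
archive.pipe(res);
archive.append(s3Stream, { name: 'file-from-s3.csv' });
archive.finalize();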