mafintosh / tar-stream

tar-stream is a streaming tar parser and generator.
MIT License

Extract multiple files at the same time? #111

Closed: sbrl closed this issue 4 years ago

sbrl commented 4 years ago

I'm processing a tar file with a number of files in it, and I'm considering parallelising my program to make it run faster, as I have to do some heavy CPU-bound work on the files in the archive. Is it possible to call the callback function when extracting a file from a tar archive before I've finished reading that file's stream? Ideally I'd like to be able to read multiple files from the tar archive at the same time, without having to wait until I've completely finished reading one.

mafintosh commented 4 years ago

Unfortunately not. It's a stream so it reads from start to end, once.

What you can do upstream is buffer the file content before flushing it to disk, so you can write multiple files at once (i.e. parallelise the file I/O). Most likely the tar streaming itself is not your bottleneck.
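
A rough sketch of that buffering idea, in case it helps. It is an illustration under assumptions, not part of tar-stream: `bufferEntries` and `handleFile` are hypothetical names, and the only tar-stream behaviour it relies on is the documented `entry` event (`header`, `stream`, `next`) and the `finish` event of the extract stream. Each entry is still read sequentially, but `next()` is called as soon as the entry is buffered, so the per-file work can overlap with reading the rest of the archive.

```javascript
// Hypothetical helper: buffer each tar entry in memory, then hand it to
// `handleFile` without waiting for that work before reading the next entry.
// `extract` is expected to be a tar-stream extract instance.
function bufferEntries(extract, handleFile) {
  const pending = [];

  extract.on('entry', (header, stream, next) => {
    const chunks = [];
    stream.on('data', (chunk) => chunks.push(chunk));
    stream.on('end', () => {
      const body = Buffer.concat(chunks);
      // Start the work now; a synchronous throw becomes a rejection.
      const job = new Promise((resolve) => resolve(handleFile(header, body)));
      job.catch(() => {}); // avoid an unhandled rejection before 'finish'
      pending.push(job);
      next(); // release the tar stream so the next entry can be read
    });
  });

  // Resolves once the archive is fully read and every handler has settled.
  return new Promise((resolve, reject) => {
    extract.on('error', reject);
    extract.on('finish', () => Promise.all(pending).then(resolve, reject));
  });
}

// Possible usage (the archive path is a placeholder):
//   const tar = require('tar-stream');
//   const fs = require('fs');
//   const extract = tar.extract();
//   const done = bufferEntries(extract, async (header, body) => { /* work */ });
//   fs.createReadStream('archive.tar').pipe(extract);
```

Note that this holds whole files in memory, so it only suits archives whose entries are reasonably small. For CPU-bound work, `handleFile` would also need to pass the buffer to something like a worker thread, since an async function alone still runs on the main thread.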

sbrl commented 4 years ago

Ah, I see. Thanks for the explanation!