ZJONSSON / node-unzipper

node.js cross-platform unzip using streams

Byte Range for File Within an Archive #294

Closed jaruba closed 2 months ago

jaruba commented 4 months ago

Hi, I've been looking at unzipper but I can't figure out whether it supports choosing a byte range of a file within the zip archive.

My intention is to stream a file from within a remote zip file (URL) through a HTTP server that supports range requests.

The unzipper.Open.url() method seems to support byte range, but I'm unsure if this is useful in my case as that seems to be the byte range for the zip archive itself, not for file.stream().

Does the decompression always have to start from the beginning of the file or can we choose to decompress only part of the file too?

Thanks in advance

ZJONSSON commented 2 months ago

The unzipper Open method exposes the central directory, with the ability to stream individual files from the zip file using the byte ranges defined in the directory.

I created a simple cli called webunzip that showcases this unzipper functionality in action:

npx webunzip https://dumps.wikimedia.org/other/poty/poty2007.zip

This lets you pick which files to extract from a 1.1 GB Wikimedia archive.


jaruba commented 2 months ago

@ZJONSSON I feel like my question was not actually answered, to be honest. My intention is to stream a video from inside a zip file on the fly and support byte range requests for seeking in that video (while it is still being unpacked), which I believe is not possible with node-unzipper?

ZJONSSON commented 2 months ago

There is no way to translate uncompressed byte ranges to zip byte ranges. I would suggest using an LRU cache or not using zip files.
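
To illustrate the point for deflate-compressed entries, here is a minimal sketch using only Node's built-in zlib module. It is not taken from unzipper; `readUncompressedRange` is a hypothetical helper and the sample data is made up. It assumes the entry's compressed bytes are already in memory as a raw deflate payload, and it shows that the requested range can only be cut out after inflating from the first compressed byte.

```javascript
const zlib = require('zlib');

// Hypothetical helper, not part of unzipper.
// Returns uncompressed bytes [start, end] (inclusive) of a raw-deflate payload.
function readUncompressedRange(deflated, start, end) {
  // Deflate offers no random access, so inflate from the first compressed byte.
  const inflated = zlib.inflateRawSync(deflated);
  // Only now can the requested range be cut out of the uncompressed data.
  return inflated.subarray(start, end + 1);
}

// Made-up sample data standing in for a compressed zip entry.
const original = Buffer.from('0123456789'.repeat(1000));
const deflated = zlib.deflateRawSync(original);

const slice = readUncompressedRange(deflated, 5000, 5009);
console.log(slice.toString()); // prints "0123456789"
```

A streaming variant could discard output until it reaches `start` instead of buffering everything, but it would still have to decode every byte before the range, which is why a cache of already-unpacked data tends to be the practical answer for seeking.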

ZJONSSON commented 2 months ago

Unless the content-encoding is gzip, in which case this is possible. But I'm not sure you can guarantee that all clients accept gzip encoding only.

jaruba commented 2 months ago

@ZJONSSON While that makes sense and is what I expected, we also implemented RAR audio/video streams. The advantage there for video seeking was that video files are already compressed, so they are not compressed further inside the RAR archive. I understand that ZIP acts differently, but I would have expected a similar scenario for audio/video files in ZIP too, since creating a ZIP file from a video file seems to give only a minimal size advantage.
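
For what it's worth, the ZIP format does define a "stored" method (compression method 0) that keeps an entry's bytes uncompressed, and for such entries an uncompressed range maps directly onto archive offsets. The sketch below is only an illustration under several assumptions: `storedRangeToArchiveRange` is a hypothetical helper that is not part of unzipper, the entry is not encrypted, the local header offset is already known (in practice it would come from the central directory), and the tiny archive is hand-built with its CRC left at zero.

```javascript
// Hypothetical helper, not part of unzipper.
// Maps an uncompressed byte range of a stored zip entry to archive offsets.
function storedRangeToArchiveRange(archive, headerOffset, start, end) {
  if (archive.readUInt32LE(headerOffset) !== 0x04034b50) {
    throw new Error('not a local file header');
  }
  const method = archive.readUInt16LE(headerOffset + 8);
  if (method !== 0) {
    throw new Error('entry is compressed; ranges cannot be mapped directly');
  }
  // The fixed header is 30 bytes, followed by the file name and extra field.
  const nameLength = archive.readUInt16LE(headerOffset + 26);
  const extraLength = archive.readUInt16LE(headerOffset + 28);
  const dataStart = headerOffset + 30 + nameLength + extraLength;
  return { start: dataStart + start, end: dataStart + end };
}

// Hand-built single-entry fragment for illustration (CRC left at zero).
const name = Buffer.from('clip.bin');
const data = Buffer.from('abcdefghijklmnopqrstuvwxyz');
const header = Buffer.alloc(30);
header.writeUInt32LE(0x04034b50, 0); // local file header signature
header.writeUInt16LE(20, 4); // version needed to extract
header.writeUInt16LE(0, 8); // compression method 0 = stored
header.writeUInt32LE(data.length, 18); // compressed size
header.writeUInt32LE(data.length, 22); // uncompressed size
header.writeUInt16LE(name.length, 26); // file name length
const archive = Buffer.concat([header, name, data]);

const range = storedRangeToArchiveRange(archive, 0, 10, 14);
console.log(range); // { start: 48, end: 52 }
console.log(archive.subarray(range.start, range.end + 1).toString()); // prints "klmno"
```

The resulting offsets could then be used as the byte range requested from the remote zip URL. Whether a given archive actually stores its video entries this way depends on the tool that created it, so the method field has to be checked for each entry.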