guzba / zippy

Pure Nim implementation of deflate, zlib, gzip and zip.
MIT License

zip ENH request add/extract single file [streams] #12

Open brentp opened 3 years ago

brentp commented 3 years ago

Thanks for the library. From what I can see, it only supports extracting everything from a zip archive or adding everything to one. I have potentially huge archives, so when writing I want to add one file at a time and flush it to disk immediately.

When reading, I also want to extract a single file from the archive into memory, without writing the uncompressed data to disk.

Maybe it could be something like:

let archive = ZipArchive("/path/to/my.zip", fmRead)
var buffer = newString(777)
# or
# var buffer = newSeq[int32](2048)
archive.extractInto("internal/zip/path.bin", buffer) # read into buffer[0].pointer

and for writing it could just support adding a single file or Stream at a time along with the compression level (if any).

Would this be something you'd consider supporting?

guzba commented 3 years ago

Hello. I do like the idea of supporting a stream interface for deflate / gzip / zlib and extending that to zip archives and tarballs, instead of doing everything in memory as it is right now. This improvement is something I plan to do as a sort of Zippy version 2. It is quite a large refactor, though, and it isn't something I am able to take the time to do right now.

brentp commented 3 years ago

I understand. Thanks for considering it and for the software.

HugoGranstrom commented 3 years ago

I'll chime in and say that a Stream approach would be a really useful feature, especially when working with large zip files. I've been trying to get Zippy to extract a 3GB zip file without success (out of memory) and not reading everything into memory at once would probably solve that.

And I agree with brent, great work so far! :D

ajusa commented 3 years ago

I'll also chime in with another use case: compressed game data. You usually store all of the assets inside a single large zip file and decompress individual assets as needed, to avoid taking up too much disk space with uncompressed assets. There are a few use cases like this where I would love to use zippy. The only alternative that I am aware of for this is physfs, which also comes with a bunch of filesystem wrapping.

guzba commented 2 years ago

I have just tagged a zippy release with an improved way for reading from zip files in it. See https://github.com/guzba/zippy/blob/master/examples/ziparchive_explore.nim for a quick intro. This requires zippy 0.9.0+.

The new reader returned from openZipArchive should work well for your use case, @ajusa, as a way to read only the assets you need from a large compressed zip archive. It should also enable extracting larger archives, @HugoGranstrom.

This is only a zip file reading API. Modifying a zip archive is more complex.
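
For readers landing here, a minimal sketch of reading a single entry into memory with the reader described above. It assumes zippy 0.9.0+ and follows the shape of the linked ziparchive_explore.nim example (openZipArchive, walkFiles, extractFile, close). The helper name readEntry and both paths are hypothetical placeholders, not part of zippy.

```nim
import zippy/ziparchives

# Hypothetical helper (not part of zippy): read one entry from a zip
# archive into memory without extracting anything to disk.
proc readEntry(archivePath, entryPath: string): string =
  let reader = openZipArchive(archivePath)
  try:
    # Only the requested entry is decompressed.
    result = reader.extractFile(entryPath)
  finally:
    # Always release the reader, even if extraction raises.
    reader.close()

when isMainModule:
  # Placeholder paths; substitute an archive and entry that exist.
  let archivePath = "assets.zip"

  # List the entries in the archive.
  let reader = openZipArchive(archivePath)
  try:
    for path in reader.walkFiles:
      echo path
  finally:
    reader.close()

  # Pull a single entry into memory.
  let data = readEntry(archivePath, "textures/player.png")
  echo data.len, " bytes"
```

If many entries are needed, it is probably better to keep one reader open and call extractFile repeatedly rather than reopening the archive each time as readEntry does.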

HugoGranstrom commented 2 years ago

Thanks a lot! :D Can confirm that the 2GB zip file that previously failed with an out-of-memory error now unzips fine, peaking at roughly 2GB of RAM usage. 🎊