brendan-duncan / archive

Dart library to encode and decode various archive and compression formats, such as Zip, Tar, GZip, ZLib, and BZip2.
MIT License

How to get a stream of decoded Data #12

Open zoechi opened 9 years ago

zoechi commented 9 years ago

I tried to find a way to decompress a zip file and get a stream of the decompressed content of a file, which I can then pipe to an IOSink. My expectation is that this way the file does not need to be decompressed entirely in RAM before it can be written to disk. Is this supported?

brendan-duncan commented 9 years ago

You can decode a zip archive, but the contents of the archive will not be decompressed until you access them.

For example, if zipBytes is the zip file contents, then Archive archive = new ZipDecoder().decodeBytes(zipBytes); will decode the zip and store it in archive. However, the contents of the zip will not have been decompressed. When you access the content of a file in the Archive, such as List jpg = archive.findFile('cat.jpg').content; then the data of that file will be decompressed on demand. The other files in the zip remain compressed until they are likewise accessed.

Is this what you are asking?

zoechi commented 9 years ago

Thanks for the explanation. What I'd like to know is whether, for the last step (getting the content), there is a way to get a stream instead. That way I wouldn't have to hold the entire decompressed file content in memory; I could connect the stream to an IOSink (of a file, for example), so that each decompressed chunk written to the stream is immediately written to the file (or wherever the stream is bound to).
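For illustration, the stream-to-IOSink pattern described here can be sketched with the GZip codec that ships in dart:io. This is not the archive package's API and it handles a plain GZip file rather than a zip entry; the helper name gunzipToFile and both path parameters are hypothetical.

```dart
import 'dart:io';

/// Hypothetical helper: decompresses a GZip file chunk by chunk and pipes the
/// decoded chunks straight into the IOSink of the destination file, so the
/// full decompressed content is never collected in one list.
Future<void> gunzipToFile(String sourcePath, String targetPath) async {
  final IOSink sink = File(targetPath).openWrite();
  // pipe() adds the whole stream to the sink and closes the sink when the
  // stream is done.
  await File(sourcePath).openRead().transform(gzip.decoder).pipe(sink);
}
```

Calling `await gunzipToFile('data.gz', 'data')` (both file names made up) from an async function would write the decoded bytes to disk as they are produced.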

brendan-duncan commented 9 years ago

Not currently, but it's a good idea.

zoechi commented 9 years ago

Yup, would be great for big files. The package is quite useful already, thanks for sharing :)

brendan-duncan commented 9 years ago

FYI, I do plan on adding this, but it's been a long crunch time at work, haven't had any personal programming time in a bit. Hopefully will get some time soon.

zoechi commented 9 years ago

Great! No rush though. Currently I call an external unzip, but I would prefer doing it entirely in Dart, not least to have a platform-neutral solution.

bergwerf commented 7 years ago

This would be pretty nice :-) Maybe you can use a Sink (if I understand correctly, the converters in dart:convert use sinks, and GZIP in dart:io uses ByteConversionSink). I'm decoding large density map files (~400 MB).
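As a rough sketch of the sink-based approach mentioned here, the GZip decoder in dart:io can be driven through startChunkedConversion, which returns a ByteConversionSink. The class ChunkCallbackSink and the function gunzipChunks are hypothetical names used only for this example; none of this is part of the archive package.

```dart
import 'dart:convert';
import 'dart:io';

/// Hypothetical output sink: forwards every decoded chunk to a callback
/// instead of buffering it.
class ChunkCallbackSink implements Sink<List<int>> {
  ChunkCallbackSink(this.onChunk);

  final void Function(List<int> chunk) onChunk;

  @override
  void add(List<int> chunk) => onChunk(chunk);

  @override
  void close() {}
}

/// Hypothetical helper: pushes compressed chunks into the decoder's input
/// sink; decoded chunks are handed to [onChunk] as the decoder emits them.
void gunzipChunks(Iterable<List<int>> compressedChunks,
    void Function(List<int> chunk) onChunk) {
  final ByteConversionSink input =
      gzip.decoder.startChunkedConversion(ChunkCallbackSink(onChunk));
  for (final chunk in compressedChunks) {
    input.add(chunk);
  }
  // Closing the input flushes any remaining output and closes the output sink.
  input.close();
}
```

The compressed chunks could come from a file, an HTTP response, or any other source, since the decoder only sees lists of bytes. Note that this sketch still relies on dart:io for the codec itself, so it would not run in a web app.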

zoechi commented 7 years ago

@hermanbergwerf see #35

bergwerf commented 7 years ago

@zoechi A RandomAccessFile-specific implementation would not work with data that is retrieved over HTTP, though.

zoechi commented 7 years ago

@hermanbergwerf It sounded to me like it would use a stream so it doesn't need the whole archive in memory. I just guessed that this wouldn't be far away from streaming in general. But let's just see what @brendan-duncan has to say about it.

brendan-duncan commented 7 years ago

The challenge here is to have an API that first and foremost works without dart:io, in order to handle web apps, but can also handle file streaming, large archives, minimizing memory usage for servers such as pub, etc. It's much more of a challenge than I anticipated when I started writing this library however many years ago. I think doing a big conversion to sinks or another such elegant API is appropriate for a 2.0. In the meantime, I have some of this working now but still have more to finish. I'm doing my best to carve out time to work on it, but I admit it's been much slower than I would like.

bergwerf commented 7 years ago

I understand. Thanks for your work! (the image library is also pretty cool!)

droplet-js commented 3 years ago

Any news or updates?