thejoshwolfe / yazl

yet another zip library for node
MIT License
341 stars, 45 forks

the only way to know finalSize is `compress = false`? #31

Closed luoyjx closed 7 years ago

thejoshwolfe commented 7 years ago

yes. well, you can wait for the entire stream to finish piping if you want, and then measure that size, but the only way to know in advance how big the output will be is to not use compression (and some other stuff mentioned in the readme).

the reason for this is the fundamental nature of lossless compression. data can be more or less compressible depending on the values of the bytes. for example, a file containing the letter 'A' repeated 1 million times will compress down very small, but a file containing a string of random bits will (probably) not compress at all. the only way to tell those cases apart is to attempt compression: actually do the hard work. this means that we only know the size after we've finished doing the compression, which is after the zipfile is almost completely done being created (everything except the central directory, which is typically a small portion of the zipfile).
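to make the two cases above concrete, here is a minimal sketch using node's built-in `zlib` and `crypto` modules (not yazl itself). it assumes raw deflate, which is the usual compression method in zipfiles; the variable names are made up for illustration.

```javascript
const zlib = require("zlib");
const crypto = require("crypto");

const size = 1000000;

// 1 million copies of the letter 'A': highly repetitive input.
const repetitive = Buffer.alloc(size, "A");
// 1 million random bytes: essentially incompressible input.
const random = crypto.randomBytes(size);

// The compressed sizes are only known after the compression has actually run.
const repetitiveCompressed = zlib.deflateRawSync(repetitive);
const randomCompressed = zlib.deflateRawSync(random);

console.log("repetitive:", repetitive.length, "->", repetitiveCompressed.length);
console.log("random:    ", random.length, "->", randomCompressed.length);
```

both inputs are the same length, so nothing short of running the compressor tells you which output will be tiny and which will be about as big as the input.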

if you're in a situation where you must know the size of a zipfile before, for example, serving it to an http client, then you must buffer the file to disk first (give the client a "please wait" message in the process), then serve the file as a regular file, and then delete the file. i believe this is what amazon does for downloading music. an alternative to this is to simply not say the size of the file in the http header, and so the browser will start downloading an unknown-size file. i believe this is what happens for github archive downloads. and of course, you could disable compression and get immediate streaming with final output size prediction. this is what groovebasin does for music downloads.
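as a rough illustration of why the size is predictable once compression is off: with stored (uncompressed) entries, every byte in the archive is either file data or a fixed-layout header. the sketch below is not yazl's own calculation. it assumes the simplest case (no zip64, no extra fields, no comments, no data descriptors), and `estimateStoredZipSize` plus the example entries are hypothetical names for illustration. a real archive can contain additional records, so treat the result as an estimate under those assumptions.

```javascript
// Fixed record sizes from the ZIP format, excluding variable-length fields.
const LOCAL_FILE_HEADER_SIZE = 30;
const CENTRAL_DIRECTORY_HEADER_SIZE = 46;
const END_OF_CENTRAL_DIRECTORY_SIZE = 22;

// Hypothetical helper: entries is an array of { name, size } where size is
// the uncompressed byte length of the file.
function estimateStoredZipSize(entries) {
  let total = END_OF_CENTRAL_DIRECTORY_SIZE;
  for (const entry of entries) {
    const nameLength = Buffer.byteLength(entry.name, "utf8");
    // Local file header, file name, then the raw file data.
    total += LOCAL_FILE_HEADER_SIZE + nameLength + entry.size;
    // The central directory repeats the header and the file name.
    total += CENTRAL_DIRECTORY_HEADER_SIZE + nameLength;
  }
  return total;
}

const entries = [
  { name: "a.txt", size: 1000 },
  { name: "music/track.mp3", size: 5000000 },
];
console.log(estimateStoredZipSize(entries));
```

everything in that sum is known before a single byte is written, which is what makes it possible to announce the size in an http header up front.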

luoyjx commented 7 years ago

thanks for that.

the situation is that i want to show zip progress, but i found it impossible to get the final size, so for now i'm using cli-spinner to show the zip status instead.