101arrowz / fflate

High performance (de)compression in an 8kB package
https://101arrowz.github.io/fflate
MIT License

Occasional CRC Errors When Streaming Data into Zip using AsyncZipDeflate #194

Closed Masty88 closed 7 months ago

Masty88 commented 8 months ago

Discussed in https://github.com/101arrowz/fflate/discussions/192

Originally posted by **Masty88** December 1, 2023

Occasional CRC Errors When Streaming Data into Zip using AsyncZipDeflate

**Context**

I am using fflate to fetch a list of 3D geographic files in various formats along with orthophotos in JPEG from Amazon S3. When retrieving the files, I use response.body.getReader() to stream the data into a ZIP folder.

**Issue**

When using AsyncZipDeflate or ZipDeflate (even with compression level set to 0), I encounter CRC errors intermittently - sometimes immediately, other times sporadically (about one in every two attempts). However, if I use the array buffer directly without streaming, or if I use ZipPassThrough for streaming, it works flawlessly 100% of the time.

**Steps to Reproduce**

1. Fetch a list of files from Amazon S3.
2. Stream the data into a ZIP folder using AsyncZipDeflate or ZipDeflate.
3. Occasionally encounter CRC errors in the resulting ZIP file.

**Expected Behavior**

The ZIP file should be created without CRC errors, similar to when using ZipPassThrough or directly passing the array buffer.

**Actual Behavior**

CRC errors occur intermittently when using AsyncZipDeflate or ZipDeflate for streaming data into a ZIP folder.

**Additional Information**

The files being fetched are 3D geographic files in various formats along with JPEG orthophotos. The issue seems to be specific to the streaming process with AsyncZipDeflate or ZipDeflate.

**StackBlitz Reproduction**

I have created a StackBlitz project to demonstrate the issue:

https://js-7tnzqy.stackblitz.io

https://stackblitz.com/edit/js-7tnzqy?file=download.js

```
import { Zip, AsyncZipDeflate } from 'fflate';

async function downloadAndCompress(urlsToDownload) {
  console.log(urlsToDownload);
  let chunks = [];
  const zipFile = new Zip();

  const zipCompletionPromise = new Promise((resolve) => {
    zipFile.ondata = (err, dat, final) => {
      if (err) {
        throw err; // or handle error as you see fit
      }
      chunks.push(dat);
      if (final) {
        const blob = new Blob(chunks, { type: 'application/zip' });
        const url = URL.createObjectURL(blob);
        resolve(url);
      }
    };
  });

  const downloadAndStreamToZip = async (url) => {
    const response = await fetch(url);
    const fileName = url.split('/').pop();
    const fileStream = new AsyncZipDeflate(fileName, { level: 4 });
    zipFile.add(fileStream);

    const reader = response.body.getReader();
    let done = false;
    while (!done) {
      const { done: chunkDone, value } = await reader.read();
      done = chunkDone;
      if (value) {
        fileStream.push(new Uint8Array(value), done);
      }
    }

    return new Promise((resolve) => {
      if (done) {
        fileStream.push(new Uint8Array(0), true);
        resolve();
      }
    });
  };

  await Promise.all(urlsToDownload.map((url) => downloadAndStreamToZip(url)));
  zipFile.end();
  return await zipCompletionPromise;
}

export { downloadAndCompress };
```

![image](https://github.com/101arrowz/fflate/assets/71265784/4ad7df84-355a-44f3-b39b-c7559519a6a2)

101arrowz commented 8 months ago

Sorry, I somehow missed your discussion post! This shouldn't be happening; have you verified this issue with other ZIP decompression software or is it only visible in Windows Explorer?

Masty88 commented 8 months ago

Thank you for getting back to me, and no worries about missing the discussion post! In the meantime, I have taken the opportunity to dive deeper into the code and conduct tests in various scenarios. However, I continue to encounter the same issue.

I've already attempted to decompress the resulting ZIP files using 7zip, but unfortunately, the problem persists with the same CRC errors. Additionally, I have checked the validity of the ZIP archive using an online verification tool, which also confirmed the presence of these errors.
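
For context, a rough sketch of the kind of check an extractor performs when it reports a CRC error: it recomputes the CRC-32 of each extracted file and compares it with the value recorded in the archive. The names `crc32` and `entryMatchesCrc` are hypothetical helpers written for illustration; they are not part of fflate or 7-Zip, and real tools may implement the check differently.

```javascript
// Lookup table for the standard reflected CRC-32 polynomial (0xEDB88320),
// which is the checksum the ZIP format stores for each entry.
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) {
    c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
  }
  CRC_TABLE[n] = c >>> 0;
}

// Hypothetical helper: CRC-32 of a Uint8Array, returned as an unsigned 32-bit number.
function crc32(bytes) {
  let crc = 0xffffffff;
  for (let i = 0; i < bytes.length; i++) {
    crc = CRC_TABLE[(crc ^ bytes[i]) & 0xff] ^ (crc >>> 8);
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Hypothetical helper: true when the extracted bytes match the CRC recorded
// for the entry. A mismatch is what tools surface as a "CRC error".
function entryMatchesCrc(extractedBytes, storedCrc) {
  return crc32(extractedBytes) === (storedCrc >>> 0);
}

// 0xCBF43926 is the well-known CRC-32 check value for the ASCII string "123456789".
const sample = new TextEncoder().encode('123456789');
console.log(entryMatchesCrc(sample, 0xcbf43926));
```

A mismatch means either the stored checksum or the decompressed bytes differ from the original input, so a failure reported by several independent extractors points at how the archive was written.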

From my observations, it seems that the higher the level of compression, the more frequently the errors occur. This pattern is consistent regardless of the decompression tool used, suggesting that the issue might be related to the compression process itself rather than decompression.

I hope this information might be helpful for further investigation. I am more than willing to provide any additional details that might aid in resolving this issue.

Thank you again for your support and your time.


olange commented 8 months ago

I hit the same issue when opening the ZIP archives that @Masty88 produced: on my MacBook Pro (macOS Sonoma 14.2), the system's standard archive tool could not decompress them.

101arrowz commented 7 months ago

This issue is concerning. I'm really sorry I haven't gotten to it yet - I'll investigate and fix this as soon as I can.

101arrowz commented 7 months ago

Reproduced and fixed locally. Will test further to verify my change fully fixes the problem.

Masty88 commented 7 months ago

@101arrowz thank you, we will wait for the release :)

101arrowz commented 7 months ago

Should be fixed in v0.8.2; let me know if you run into any issues!