thejoshwolfe / yauzl

yet another unzip library for node
MIT License
735 stars 80 forks source link

Error: invalid central directory file header signature when unzipping big files, zipped by mac os #69

Closed d-mg closed 9 months ago

d-mg commented 6 years ago

Using the terminal to zip them there is no problems with unzipping. But when I use the GUI for zipping, and only with bigger files (around 8 GB and up) I get the following error:

Error: invalid central directory file header signature: 0x2d8c9fa4 at /Users/dmg_/projects/test-zipper/nodemodules/yauzl/index.js:253:70 at /Users/dmg/projects/test-zipper/nodemodules/yauzl/index.js:622:5 at /Users/dmg/projects/test-zipper/node_modules/fd-slicer/index.js:32:7 at FSReqWrap.wrapper [as oncomplete] (fs.js:658:17)

Also when running zipinfo I get:

Archive: Archive.zip Zip file size: 5946143505 bytes, number of entries: 140 warning [Archive.zip]: 4294967296 extra bytes at beginning or within zipfile (attempting to process anyway)

I did some digging through issues and some searching and I'm pretty sure these extra bytes make it that Yauzl can't find the header of the central directory.

Do you have any clue why GUI Mac OS zipped zipfiles have these extra bytes? I do see it puts some extra '__MACOSX' folders in the zip and suspect the presence of these to have something to do with it, but that's more of a guess.

thejoshwolfe commented 6 years ago

Thanks for the report. Is it possible to provide a download for a zip file that reproduces this problem? You mention it happens with very large files, like 8GB, so perhaps hosting a download would be infeasible.

Alternatively, could you provide minimal steps to reproduce the issue? Perhaps something like:

  1. head -c 8000000000 /dev/urandom > file.bin
  2. right click on file.bin, and choose something.

I don't have a Mac, so i'd be forwarding these instructions to a friend who can transfer me an 8GB file easily.

thejoshwolfe commented 6 years ago

Do you have any clue why GUI Mac OS zipped zipfiles ...

An easy answer is: yes, I have a clue. The Mac Archive Utility is a piece of trash when it comes to the zip file format. This will be the fourth (I believe) compatibility issue that boils down to Apple getting the zip file format embarrassingly wrong.

I'm not surprised at all that this is an issue. In fact, I previously thought that Archive Utility was simply unable to handle zip files larger than 4GB due to lack of ZIP64 support. Are you sure the zip files that you create are readable by anyone? Like, can the GUI itself read its own files that it creates that are 8GB in size? If so, then there's some strange format at play that isn't ZIP64 format.

d-mg commented 6 years ago

Did some more testing and it seems to fail somewhere between 4.19 and 4.33 GB.

Since I'm trying to zip 4k mp4's I used ffmpeg to create files (warning, don't watch them with your volume opened completely) https://www.ffmpeg.org/download.html

this creates a 32 seconds mp4 (4.19 GB) ffmpeg -f rawvideo -video_size 3840x2160 -pixel_format yuv420p -framerate 25 -i /dev/urandom -ar 48000 -ac 2 -f s16le -i /dev/urandom -c:a aac -t 32 output32secs.mp4

this creates a 33 seconds mp4 (4.33 GB) ffmpeg -f rawvideo -video_size 3840x2160 -pixel_format yuv420p -framerate 25 -i /dev/urandom -ar 48000 -ac 2 -f s16le -i /dev/urandom -c:a aac -t 33 output33secs.mp4

The 32 seconds one unzips fine with yauzl, the 33 seconds (or any larger) one fails with: Error: invalid central directory file header signature: XXXXXXXX.

I zip them by simply going into Finder on my Mac, right clicking the file and choosing compress "output33secs.mp4". The resulting zip file causes the error when unzipping with Yauzl. Zipping them through the command line zip somezip output33secs.mp4 creates a zip file that Yauzl unzips just fine.

Hope this is enough to reproduce the problem, if not let me know.

thejoshwolfe commented 6 years ago

Thanks, I've reproduced the issue. Indeed, the Info-ZIP unzip command line program can't read this file correctly either, so this issue is not specific to yauzl.

I'll do some research and tell report back here. (It may take a few weeks.) I'm curious to see what the hell Archive Utility did that it can read its own malformed zipfile but no one else can.

thejoshwolfe commented 6 years ago

I've determined what the issue is. It's pretty much the same problem as #60.

When the Mac Archive Utility needs to write 32-bit values in the metadata, it doesn't check for overflow and simply writes the lower 32 bits of the value. There are 3 values that get corrupted like this in the case of archiving a file that's over 4 GB in size: the "offset of start of central directory with respect to the starting disk number" in the end of central directory record and the "compressed size" and "uncompressed size" fields of the local file header and central directory file header (and data descriptor, but that doesn't matter too much).

This means that the metadata is an outright lie. This is absolutely a bug in Archive Utility, and it's a nasty one too. The correct behavior would have been to use ZIP64 format, but Archive Utility has no idea what that is.

There may or may not be any way to correctly and reliably recover from this corruption. I'm still thinking about it.

d-mg commented 6 years ago

Thank you for looking into it. Also I noticed that sometimes the unzip CLI tries and unzips part of the file but they you lose half of the video file and it's completely not reliable. Hope you do find a solution, but in the meantime my workaround is just avoiding the Archive Utility. I did send a report to apple, if they get back to me with any relevant information I will share that here as well.

thejoshwolfe commented 6 years ago

Oh. How did you do that? I was unable to find any bug tracker for either Archive Utility or the ditto command line program, which is allegedly the backend for Archive Utility's zipfile functionality.

If Archive Utility's zipfile implementation were open source, i could even make a pull request fixing their bugs, but I didn't find that either.

d-mg commented 6 years ago

I reported it here: https://www.apple.com/feedback/macos.html It's pretty much just sending feedback for macOs, they state that they will look at everything sent in there but that they might not get back to you. It's a long shot but who knows if people start nagging they might fix it...

overlookmotel commented 6 years ago

@d-mg Out of interest, what version of Mac OS did you create the faulty zip file on?

Absolutely bonkers that Mac OS has such a broken ZIP implementation, especially as it's meant to cater for "creatives" who are the most likely people to be zipping up large (e.g. video) files.

@thejoshwolfe Are you saying that Mac OS Archive Utility does not make use of ZIP64 extra fields at all?

overlookmotel commented 6 years ago

If the number of files in the ZIP is not enormous, I guess that "offset of start of central directory with respect to the starting disk number" can be deduced from the total length of the ZIP file and lower 32 bits of the offset recorded by MacOS.

Each central directory file header is max 196654 bytes (46 + 65536 + 65536 + 65536). So if the ZIP contains less than 21000 files (approx), the central directory cannot be bigger than 4GiB. So then the low 32 bits are enough to determine the location of start of the central directory.

That doesn't solve the problem of "compressed size" and "uncompressed size" (and also, presumably, the "relative offset of local file header") fields also being truncated to 32 bit. But I guess you could use the info that is available to step through the file in 4 GiB jumps to check all possible locations for matching local file headers.

Ergh! But do you think this would work?

My work involves receiving a lot of large ZIP files from sources all over the place, and I sadly have no way to prevent them being created with Mac OS Archive Utility (I suspect most are), so I would be motivated to help solve this problem.

I also have a massive collection of large ZIP files from varied sources to test an implementation on.

d-mg commented 6 years ago

@overlookmotel I was using High Sierra for this

thejoshwolfe commented 6 years ago

Ergh! But do you think this would work?

I developed a plan with a lot of the same ideas you presented to make this work, but it's very difficult to implement. The gist is to track a non-deterministic multistate of possibilities depending on what kind of corruption the zipfile has.

The user would need to explicitly request this complexity for three reasons: I don't want everyone to pay the performance cost of Apple's incompetence/hostility if they don't want to; the user needs to use the library differently when this complexity is enabled; the complexity can change the correct interpretation of a correct zipfile to be incorrect in rare cases.

The Mac Archive Utility behaves predictably, albeit incorrectly, when laying out data in the zipfile, so I can check if the zipfile is deviating from the Archive Utility-style. If we know we aren't dealing with Archive Utility, we can drop the complexity and behave normally. However, we can't be certain that we are dealing with Archive Utility until we verify that one of these overflow corruptions has happened.

Each case of corruption can be detected heuristically by seeking to the locations that would cause the overflow to give us the value we see, and checking the zipfile data structure signatures. This is not a perfect solution, because there may be a valid zipfile that perfectly mimics the Archive Utility corruption. That's the main problem with this Archive Utility bug is that it creates "valid" zipfiles with the wrong content. So whether we should assume that it's corrupted or correct is a guess, and that's one of the reasons why the user would have to explicitly ask for this feature to be enabled.

The probability of this "false positive" detection happening randomly is low, so it's not the worst thing about this idea.

The worst part is that this it overall sucks. It's really complicated, it's going to negatively impact performance at least a little bit for everyone, and it's all due to flagrant incompetence/hostility at Apple being widely adopted. My motivation is really the limiting factor for implementing a fix for this, but hearing people say that this is important to them really helps. :+1:

@thejoshwolfe Are you saying that Mac OS Archive Utility does not make use of ZIP64 extra fields at all?

Yes. I haven't seen the source code, because I believe it's closed source, but all of my experience and experimentation with it suggest that the implementation is oblivious to the ZIP64 format entirely, and apparently also oblivious to the concept of integer overflow and bounds checking.

overlookmotel commented 6 years ago

@thejoshwolfe Thanks for your lengthy reply.

I agree that it totally sucks! But, at least for me, it would be really useful to resolve.

Complexity: To my mind, this is the biggest downside. It would make the code considerably more complex. I imagine it would likely require a refactor to split different steps of file parsing into separate functions, so functions can call themselves recursively as they try different possibilities. That's a lot of code churn.

Performance: As far as I can see, I doubt that it would impact performance in any significant way for non-Apple ZIP files. The workarounds for the Apple case, I think, only need kick in if a valid Central Directory Record cannot be found in the location it should be. So the execution path for non-Apple archives could remain the same.

False positives: I hope possibility of false positives could be reduced to zero, or at least to an infinitesimal level. I hope (but don't know yet) that Archive Utility ZIP files can be "fingerprinted" in some way. For example, maybe they always set the "version made by" field to a particular value which no-one else uses. If so, then the wacky heuristics could be disabled if it's not an Archive Utility file - so anything not made by Archive Utility would behave as before, and the only files which would be parsed differently would be ones which at present yauzl can't deal with anyway.

Explicit enabling: I agree this should have to be explicitly enabled by the user.

I would like to do some digging on this and see what I can find.

In principle, if (1) I can come up with something half-sensible which doesn't affect performance significantly or risk screwing up valid ZIP files and (2) am willing to do a lot of the work on implementation, would you likely be willing to accept the downside of additional code complexity in order to support Archive Utility files? (I totally understand if you don't)

overlookmotel commented 6 years ago

@d-mg Thanks for letting me know.

thejoshwolfe commented 6 years ago

would you likely be willing to accept ...

To be honest, since you asked, probably not. The effort to thoroughly review a pull request of this magnitude would be on par with authoring the code from scratch. So I'd rather just write it myself, if that makes sense. There's an awful lot of careful design and consideration that needs to be done to ensure this change doesn't regress anything already in place, such as performance and API-level corner case behavior (neither of which are very well tested automatically here).

But thanks for offering. Your enthusiasm is becoming my enthusiasm, which is the currency that funds this project. I'll revisit this issue in a few weeks after I work through some major non-software issues I'm in the middle of at the moment. Stay tuned for updates.

overlookmotel commented 6 years ago

OK, fair enough. I've got really interested in this, and was looking forward to getting my hands dirty, but hey! I really appreciate it that you're willing to implement.

In case it's useful, I did a little digging into how Archive Utility structures ZIP files.

Structure of ZIP files

Structure:

Fingerprinting:

All of the above hold true for both ZIP files created by graphical Archive Utility and command line ditto. I can't find any obvious preferences or other options which change these behaviors.

Proviso: I've not checked that many files, and this is all on one machine running Mac OS Sierra 10.12.6. It's possible behavior may vary on different versions.

What does this mean?

The consistencies in structure should make the file easier/more reliable to parse than it might otherwise be:

File Header 1 is at byte 0. File Header 2 is immediately after end of file 1 Data Descriptor, and so on. So to get the correct locations and compressed lengths of each file, it should only be necessary to seek for the Data Descriptor for each file, firstly in location where it should be, and then in 4GiB steps onward from there until the Data Descriptor is found. Data Descriptor is 16 bytes, so risk of false positives is very small.

Of course, there's no way of knowing the size of uncompressed data without uncompressing it.

Also, hopefully some of these attributes can be used to "fingerprint" Archive Utility's ZIP files.

Overall, this makes me hopeful that implementing this will not be too much of a nightmare, and should be pretty robust in terms of avoiding screwing up valid files.

What I'm going to do next

I have a large collection of ZIP files from a bunch of different sources (by accident - just haven't had time to clear them off my drives). Many of these will no doubt have been created with Archive Utility, and many are 4GB+. I also have the unzipped contents of each (unzipped with various different tools). In many cases the unzipped files are in formats which have built-in checksums so I can be confident they've been unzipped correctly.

I plan to run yauzl on all of these files and see what it fails on. That'll give us a corpus of files to test the new version on once Archive Utility workarounds have been implemented.

I'll let you know where I get to.

NB I have no idea what the "major non-software issues" you are facing at the moment may be. But just to say, please don't feel the need to reply. I appreciate the attention you've given this issue, and it can wait until you're back in action.

overlookmotel commented 6 years ago

I've done some work on this.

Implementation

I'm afraid I couldn't resist getting into it fully. I've made an implementation that parses faulty Mac Archive Utility ZIP files. It is here: https://github.com/overlookmotel/yauzl-mac

It's not published on npm as I intended it as a proof of concept and testing ground for seeing if this could work. Obviously I hope it'll be useful to you in adding an implementation of this feature to yauzl proper.

My implementation is very hacky as most of yauzl's guts are not exposed on the public API. So, for example, it has to use events-intercept to capture emitted 'entry' events and do some more work (locating the end of the file) before re-emitting the event.

The guts of it are in:

It depends on a change to yauzl which exposes more details about the Central Directory on ZipFile objects which I've submitted as PR #78.

The modifications don't do anything unless .open() is called with supportMacArchiveUtility option set.

Testing

I've tested it out on a collection of 331 random zip files. Most are large (10GB+), and about 150 of them are Mac Archive Utility-created ZIPs which yauzl was not able to unzip.

With the implementation linked to above, it can unzip all of the files except 3. All of these I've compared to the result from unzipping with command line unzip, ditto and unar (had to use different programs to successfully unzip different files) and they all come out identically.

Of the 3 failing ZIP files, 2 are seemingly Mac Archive Utility ZIP files, but they have no Central Directory at all. The files end straight after the last Data Descriptor. These are fully corrupt and I don't think yauzl should attempt to unzip them. I don't know how these files were made but I wonder if the zipping process errored out before finishing. Weird!

The last failing file is a strange case. It's not a Mac Archive Utility file, but I haven't figured out what's wrong with it yet.

The vast majority of the files I've tested are ZIPs containing less than 10 files. However, this implementation also successfully extracts all the files from the ZIP file linked to in #60 which has 71,609 files.

Please let me know what you think when you get a chance.

thejoshwolfe commented 6 years ago

@overlookmotel awesome! I took a look at your https://github.com/overlookmotel/yauzl-mac , but I didn't quite get to a point where I understand how it works.

Those test results are pretty impressive.

The files end straight after the last Data Descriptor.

Out of curiosity, how are you examining the corrupt zipfiles to determine things like this? I've been using a tool I made (that's undergoing an overhaul and translation into a native language here: https://github.com/thejoshwolfe/hexdump-zip ). Are you just using a general hexdump/viewer or are you using something specifically for zipfiles?

overlookmotel commented 6 years ago

Thanks!

As you said, Archive Utility is faulty, but faulty in an entirely predictable way. So actually, it wasn't hard to get it to parse the faulty files.

My implementation is more complex than it should be only because it's doing all kinds of weird things to monkey-patch yauzl from the outside. It'd be much cleaner implementing inside yauzl. If you're having trouble disentangling how my code works, please just shout.

Out of curiosity, how are you examining the corrupt zipfiles?

I was just using a file viewer (HexFiend on Mac OS). That's a pain, obviously, but I've not had to do much of it. It's easy to spot that the Data Descriptor is at the end of the file as it's so short.

hexdump-zip looks good. I'll have a play (once I've got my head around installing zig).

overlookmotel commented 6 years ago

By the way, if you'd prefer I release the Archive Utility parsing capability as a separate npm module, I'd be happy to do that. Or maybe that could be a temporary stopgap if you're not going to have time for this in near future?

If so, could you possibly merge #78 to make that possible?

thejoshwolfe commented 6 years ago

Sorry for the delay. I'm going to work on this issue next week.

It's easy to spot that the Data Descriptor is at the end of the file as it's so short.

I discovered that Archive Utility does not consistently put Data Descriptors at the end of file contents. I can't see a pattern for when they are there and when they aren't.

once I've got my head around installing zig

A lot of effort has gone into making this painless for Linux and Windows. Unfortunately Mac is the difficult OS to get a Zig compiler on. If you have one of those other OS's, it might be worth downloading the precompiled, dependency-less binary that's auto built from zig's master branch here: https://ziglang.org/download/

overlookmotel commented 6 years ago

Sorry for the delay. I'm going to work on this issue next week.

No worries!

I discovered that Archive Utility does not consistently put Data Descriptors at the end of file contents. I can't see a pattern for when they are there and when they aren't.

The pattern I've seen is:

Symlinks are stored like a file with the file content being the path the link points to.

Local File Headers:

Central Directory:

Weird that the Extra Fields in the Local File Header and Central Directory don't match!

I haven't found any Archive Utility zip files that don't fit this pattern (apart from the 2 I mentioned which have no Central Directory at all).

Have you seen files which don't fit this pattern?

overlookmotel commented 6 years ago

A lot of effort has gone into making this painless for Linux and Windows. Unfortunately Mac is the difficult OS to get a Zig compiler on. If you have one of those other OS's, it might be worth downloading the precompiled, dependency-less binary that's auto built from zig's master branch here: https://ziglang.org/download/

Ah bugger. I am Mac!

andrewrk commented 6 years ago

@thejoshwolfe you can cross compile a macos hexdump-zip binary like this:

zig build-exe --library c src/hexdump-zip.zig --target-os macosx --target-arch x86_64

I just tried it, it works. Here it is hexdump-zip.zip

overlookmotel commented 6 years ago

@andrewrk Unfortunately I can't run that on Mac OS Sierra 10.12.6. It errors. But this isn't the right issue to list the errors etc. Where is?

thejoshwolfe commented 6 years ago

Extra fields always contains an entry with ID 22613 (values differ, and no idea what this is)

This is mentioned in the spec:

0x5855        Info-ZIP UNIX (original, also OS/2, NT, etc)

This is a struct that stores some unix file info, like mtime, atime, and uid.

BenV commented 6 years ago

Hey all, I was wondering if there was any additional work planned on this problem? We are running into a similar issue with these ZIPs from the Archive utility, and it would be great if there was a way to extract them. Happy to help out in any way.

thejoshwolfe commented 6 years ago

Hi all, sorry for the slow progress here.

I've written up a huge amount of experimental prototype code to handle this issue, but it's not stable enough for a release yet. This is an extremely difficult problem to reason about, and i got burnt out on it 5 months ago. I should be returning to work on this in a few days.

This is also related to #84.

BenV commented 6 years ago

Thanks so much for the quick reply. Totally understandable to get burnt out on such a problem, I know how it goes. Keep fighting the good fight and know your effort is greatly appreciated! Please don't hesitate to reach out if you need volunteers to test the new code out.

andrewrk commented 6 years ago

It's kind of crazy that Apple is too busy counting their cash to fix this inexcusable bug, and @thejoshwolfe is working for free on an elaborate workaround. Maybe Apple should be funding this project, or maybe they should fix their god damn software.

rossj commented 6 years ago

I, for one, would donate / tip / beerpay to show @thejoshwolfe some love. These zip problems are old and persistent, and it its clear to me from the comments in this project and others, that @thejoshwolfe is THE person with the knowledge and tenacity required to tackle these fiddly bits and make the best zip library - one that is strict yet accommodating, low-level yet user friendly.

overlookmotel commented 6 years ago

@BenV @andrewrk @rossj For your info, I have implemented Mac Archive ZIP file handling at https://github.com/overlookmotel/yauzl-mac.

It's not on npm, but you can install as a git dependency by putting this in your package.json:

"dependencies": {
  "yauzl-mac": "git://github.com/overlookmotel/yauzl-mac"
}

Then use yauzl-mac in place of yauzl and set the the option supportMacArchiveUtility: true when calling .open(), .fromFd(), .fromBuffer() or .fromRandomAccessReader().

e.g.:

const yauzl = require('yauzl-mac');

yauzl.open('path/to/file.zip', { supportMacArchiveUtility: true }, function(err, zipfile) {
  /* ... */
} );

I've not written tests, but I did run it on a collection of thousands of ZIP files and it works on all of them, including hundreds of Mac Archive ZIP files which yauzl cannot handle. So it has been well-tested.

I have not released this on npm as it was intended as a base on which @thejoshwolfe could create his own implementation as part of yauzl. But it does work.

overlookmotel commented 6 years ago

Have released this on npm now - yauzl-mac.

Please let me know how you get on.

overlookmotel commented 6 years ago

@thejoshwolfe Would you please consider merging #78? At present yauzl-mac has to depend on a fork of yauzl to access some details of the central directory that it needs to parse the file. Obviously it'd be ideal if could depend on yauzl itself instead instead.

BenV commented 6 years ago

@overlookmotel Thanks so much for publishing this to npm, I had been following your efforts and planned on using your module as a stop-gap until this library officially added support. I will plan on testing this out hopefully this week and let you know how it goes. I have high hopes since it sounds like you've done pretty extensive testing already.

thejoshwolfe commented 6 years ago

@overlookmotel I gotta admit https://github.com/overlookmotel/yauzl-mac is really hard to read. I wonder if just forking yauzl would have made more sense. Ah well.

I believe this is the set of fields we have to check for overflow:

Did I miss any?

The minSize/maxSize calculation you're doing is intriguing, but I'm worried that it's just too complicated for yauzl. If we're going to heuristically support Mac Archive Utility zipfiles, we should enable it by default, and it can't be perfect. There's only so many guards that we can put in to prevent falsely identifying a Mac zipfile, and the min/max heuristic might not be worth the benefit.

One thing I'd really like to get going with this is the ability to test code without lugging around enormous zip files. I've come up with a plan for a test harness that should fit reasonably into git and hopefully be performant enough to run by default with npm test.

I didn't make as much progress on this tonight as I was hoping, but I'll continue work on this shortly.

overlookmotel commented 6 years ago

Yes, I'm sorry it is hard to read.

Probably I should have just forked yauzl. However, I generally avoid forks as then there's a burden of keeping it up to date, and you'd made it clear that you wouldn't accept a PR as you wanted to implement yourself (fair enough). Really I was just doing it because I got interested and wanted to solve the problem for the sake of solving it, so I did it as code external to yauzl.

I think the fields you list above are the only ones that need checking - except a few others if you try to fingerprint Mac OS ZIP files.

My logic with the fingerprinting of Mac OS Archive files is:

The minSize/maxSize logic is perhaps over-complicated. Just so you understand my motivation: I am primarily using yauzl on ZIP files which are not stored locally, so every IO call requires an HTTP request. Therefore the cost of trial-and-error searching imposes more cost than it would for the average user, and I wanted to keep it to absolute minimum.

It'd be completely legitimate for you to decide that the complexity vs performance trade-off isn't worth it for the common use case and to skip this.

Testing: Yes, I've struggled to come up with a good way of writing tests without having to have massive ZIP files to test it on. Very intrigued as to what solution you've come up with!

Just to say, I am more than happy to help with this. I would happily turn my implementation into a PR, but you've said you don't want that. But if you'd like a 2nd pair of eyes on your implementation, or to talk you through my spaghetti code, please just say.

overlookmotel commented 6 years ago

Also, when you do have a working implementation, I'd be happy to run it on my large collection of ZIPs to flush out any bugs.

overlookmotel commented 6 years ago

@BenV Just so you know, yesterday I rebased PR #78 (which yauzl-mac depends on) onto latest yauzl master, and I haven't tested again since then.

I think it's unlikely, but it's possible that might have introduced a bug. If it has, it'll be a syntax error / total crash kind of bug, not a nasty insidious one that wrecks your ZIP files - so don't panic! But if you find any problems please post an issue and I'll fix asap.

BenV commented 6 years ago

@overlookmotel I attempted to use your library, but ran into some problems due to the fact that it is reading entries before I have a chance to install my own error handler.

We have some other invalid zips that have absolute paths and we want to ignore those entries, and unfortunately that no longer works when I use yauzl-mac since the absolute path errors are thrown before I have a chance to install my error handler.

I'll play around with it a bit more and see if I can come up with a workable solution, but if you have any ideas of what I should do to move forward it would be appreciated.

BenV commented 6 years ago

@overlookmotel I was able to get things working by adding a hacky errorFilter option into a yauzl fork that allowed me to ignore certain errors and automatically call readEntry again, which prevents the error event from being emitted when I don't want it to (for absolute/relative paths), so I now have a yauzl-mac library that has all the fixes I need in it. I'll let you know if I run into any issues.

overlookmotel commented 6 years ago

Hi @BenV.

Could you possibly raise this as an issue at https://github.com/overlookmotel/yauzl-mac/issues ? I don't quite understand what you're saying, so keen to ask you questions and fix it. But don't want to crowd this thread with talk about something not directly related to yauzl.

thejoshwolfe commented 6 years ago

Very intrigued as to what solution you've come up with!

See #92

overlookmotel commented 1 year ago

For anyone facing same problems with Mac OS ZIP files, please see yauzl-promise.

v4, just published, is a from-scratch reimplementation of yauzl with a promise-based API, and supports Mac OS ZIP files.

thejoshwolfe commented 9 months ago

Sorry yall. This issue completely killed my enthusiasm for this project. I have finally returned, but the cost was that i'm never going to support Mac Archive Utility corrupt zip files.

Please let me know if anyone finds a way to file a bug report against Apple, and will advise people to upvote it or something.

overlookmotel commented 9 months ago

Hi @thejoshwolfe. I'm really sorry. I feel responsible.

In the intervening time, I've published a fork yauzl-promise which does support Mac OS faulty zips, but also changes API to one based on promises, and makes some other changes. That's gained a bit of traction, but the download numbers are completely dwarfed by those of your original yauzl. So I'm sure many will prefer the original API, and will massively appreciate you being back on the scene.

Again, sorry I pushed this issue and contributed to you losing enthusiasm. Very glad you're back.

danielweck commented 9 months ago

Thanks @thejoshwolfe (and @overlookmotel too) for supporting/maintaining this open source codebase. Best wishes, Dan

thejoshwolfe commented 9 months ago

@overlookmotel you are absolutely not responsible for my feelings, haha. You've done great work here. I really appreciate your enthusiasm. If anyone's responsible for my burnout, it's me for not being more empathetic to myself (therapy is great, yall; I highly recommend therapy.). And the source of the burnout isn't anyone in this community; it's Apple. :slightly_smiling_face:

overlookmotel commented 9 months ago

Thanks. That's kind of you. And +1 on therapy!

nrutman commented 8 months ago

@thejoshwolfe and @overlookmotel, I wanted to follow-up on this thread because I'm using yauzl and coming up against the same OSX issues described in this thread. With the increasing adoption of Macs/OSX, I'm concerned that the users of my app may try to use OSX-generated zip files and see the problems that this thread indicates.

@overlookmotel, I mostly wanted to see if you are continuing to maintain yauzl-promise and whether some of the recent changes to yauzl will make their way there, or whether that library is no longer being maintained. (It looks like nothing was committed in the past 10 months.)

Unfortunately, it seems like Apple's problems are a reality for anyone who wants to provide unzip functionality in the marketplace. I'm just trying to figure out the best way forward for my project.