max-mapper / extract-zip

Zip extraction written in pure JavaScript. Extracts a zip into a directory.
BSD 2-Clause "Simplified" License
391 stars, 127 forks

Extracting special characters from OSX archive #132

Open avallete opened 2 years ago

avallete commented 2 years ago

Hi there,

Using extract-zip on multiple platforms, our team found that an underlying bug in yauzl (itself caused by a bug in the OSX archiver) causes OSX archives containing special characters to have wrongly encoded filenames:

See:

https://github.com/thejoshwolfe/yauzl/issues/84
https://github.com/thejoshwolfe/yauzl/issues/69

The issue is that extract-zip itself runs yauzl with the default { decodeStrings: true } parameter, which makes it lose the original buffer.
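To illustrate why the raw buffer matters, here is a minimal sketch using only Node built-ins. It assumes a file name whose bytes are UTF-8 but which gets decoded with a single-byte code page; `latin1` is used purely as a stand-in because Node ships it, whereas yauzl's fallback for unflagged names is CP437. The file name is made up for the example.

```javascript
// Raw file name bytes, as they might sit in a zip entry written
// without the UTF-8 flag ('caf\u00e9.txt' is a made-up example name).
const rawName = Buffer.from('caf\u00e9.txt', 'utf8')

// Decoding with a single-byte code page splits the two-byte 'é'
// into two unrelated characters (latin1 is a stand-in for CP437 here).
const wrong = rawName.toString('latin1')

// With the Buffer still at hand, the caller can pick the right encoding.
const right = rawName.toString('utf8')

console.log(wrong) // cafÃ©.txt
console.log(right) // café.txt
```

Once only the decoded string is handed to the caller, that choice is gone, which is why the fix needs the name as a Buffer.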

I've implemented a fix here: https://github.com/Clovis-team/extract-zip/commit/a68d7657bc2e1bf1711ee5c79893464eebeb7cad

The idea is to introduce chardet as a 'fallback' for the missing utf-8 flag: if its detection is confident enough that the passed buffer is indeed utf-8, we pretend that the "utf-8 flag" has been set.
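The fallback logic could be sketched roughly as follows. This is not the code from the linked commit: to stay dependency-free it swaps chardet's confidence score for a strict UTF-8 validity check with the built-in `TextDecoder`, and `tryDecodeUtf8`, `decodeEntryName` and `legacyDecode` are hypothetical names introduced for the example. It assumes the name arrives as a Buffer, as it does with `{ decodeStrings: false }`.

```javascript
// Bit 11 of the zip general purpose bit flag marks names as UTF-8.
const UTF8_FLAG = 0x800

// Stand-in for chardet: strict UTF-8 validation.
// Returns the decoded string, or null if the bytes are not valid UTF-8.
function tryDecodeUtf8 (buffer) {
  try {
    return new TextDecoder('utf-8', { fatal: true }).decode(buffer)
  } catch (err) {
    return null
  }
}

// Hypothetical helper: decode a raw entry name.
// `legacyDecode` is whatever the caller uses for unflagged names.
function decodeEntryName (rawName, generalPurposeBitFlag, legacyDecode) {
  // Flag set: the archive itself says the name is UTF-8.
  if (generalPurposeBitFlag & UTF8_FLAG) return rawName.toString('utf8')

  // Flag missing: if the bytes look like UTF-8, act as if it were set.
  const asUtf8 = tryDecodeUtf8(rawName)
  return asUtf8 !== null ? asUtf8 : legacyDecode(rawName)
}

// Usage: an unflagged name that is nonetheless valid UTF-8.
const name = decodeEntryName(
  Buffer.from('caf\u00e9.txt', 'utf8'),
  0,
  (buf) => buf.toString('latin1') // placeholder legacy decoder
)
console.log(name)
```

A validity check is binary, while chardet reports a confidence level, so the threshold tuning in the actual fix has no equivalent here. Either way it remains a heuristic: a short name in a legacy code page can happen to be valid UTF-8.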

I'm not sure this is the best way to handle the problem, so I'm open to suggestions on other approaches. I'll probably also propose the same fix to yauzl directly, so in the near future we might be able to drop this patch by simply upgrading yauzl to the latest version. In the meantime, this could already fix the issue for this package.