adamhathcock / sharpcompress

SharpCompress is a fully managed C# library to deal with many compression types and formats.
MIT License

"Illegal byte sequence" when using unzip on macOS High Sierra to extract a file with Cyrillic characters #315

Open alex-swiftify opened 7 years ago

alex-swiftify commented 7 years ago

When trying to extract repacked.zip (created with SharpCompress) with macOS unzip (UnZip 6.00 of 20 April 2009, by Info-ZIP), the file with Cyrillic characters in its name (Векторный смарт-объект-3.png) fails to extract:

MacBook-Pro-Alex:_ alex$ unzip repacked.zip 
Archive:  repacked.zip
error:  cannot create icredible_mockAPI_version/ICredible/Assets.xcassets/Images/who_rated_me.imageset/??????????????? ???+?????-?????????-3.png
        Illegal byte sequence

The original zip archive containing the same file, original.zip (likely compressed with macOS "Archive Utility"), extracts with unzip just fine.

This problem appeared after I upgraded to macOS High Sierra (it was not present on macOS Sierra). Is there anything we can do with the file name encoding (e.g. use Unicode) to have the archive properly unzipped by macOS unzip?

adamhathcock commented 7 years ago

There is the ArchiveEncoding that is defaulting to UTF8. I should revisit the spec to see about flagging more encoding if that's possible.

It's likely all of zip encoding needs to be revisited. It is odd though that it's only broken with High Sierra. I guess unzip got more strict.

alex-swiftify commented 7 years ago

> There is the ArchiveEncoding that is defaulting to UTF8. I should revisit the spec to see about flagging more encoding if that's possible.

Thanks! Can we currently set the encoding to anything else (e.g. UTF-16) without changes to SharpCompress?

> It is odd though that it's only broken with High Sierra. I guess unzip got more strict.

Since the "unzip" utility is dated 20 April 2009, I'm pretty sure this is related to the upgrade to the APFS file system that was introduced in High Sierra.

adamhathcock commented 7 years ago

Yeah, you can change the encoding to anything. The question is just whether unzip understands or expects it. The code only writes out a specific flag for UTF8.
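
For anyone who wants to check whether a given archive actually carries that flag, here is a minimal sketch in Python (not SharpCompress code). It assumes the flag in question is general purpose bit 11 (0x0800) from the ZIP specification, which marks an entry's file name as UTF-8. The function name `entries_missing_utf8_flag` is made up for this example.

```python
import zipfile

# General purpose bit 11 in the ZIP format: file name and comment are UTF-8.
UTF8_FLAG = 0x0800


def entries_missing_utf8_flag(zip_source):
    """Return entry names that contain non-ASCII characters but lack the UTF-8 flag.

    zip_source may be a path or a file-like object. Entries without the flag
    are decoded by Python as CP437, so their names may look garbled here.
    """
    missing = []
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            has_non_ascii = any(ord(ch) > 127 for ch in info.filename)
            if has_non_ascii and not (info.flag_bits & UTF8_FLAG):
                missing.append(info.filename)
    return missing
```

An empty result means every non-ASCII name is flagged as UTF-8, so a failure to extract would then point at the extracting tool rather than at the archive.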

HadrainChen commented 6 years ago

Has anyone fixed this?

mrietveld commented 6 years ago

Use open, as in open fileWithUnicodeCharacters.zip. It looks like open calls an internal OS X program that has no problem opening these types of .zip files.

alex-swiftify commented 6 years ago

Replacing the unzip utility used in our script with ditto, as suggested here, fixed the problem.
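
In case it helps someone scripting the same workaround, here is a rough sketch of calling ditto from Python. It assumes macOS, where `ditto -x -k` extracts a PKZip archive into a destination directory. The helper names `build_ditto_command` and `extract_with_ditto` are invented for this example and are not part of any library.

```python
import subprocess


def build_ditto_command(archive_path, destination_dir):
    """Build the ditto invocation: -x extracts, -k treats the source as a PKZip archive."""
    return ["ditto", "-x", "-k", archive_path, destination_dir]


def extract_with_ditto(archive_path, destination_dir):
    """Run ditto (macOS only); raises CalledProcessError if extraction fails."""
    subprocess.run(build_ditto_command(archive_path, destination_dir), check=True)
```

For the archive from this issue, the call would be `extract_with_ditto("repacked.zip", "extracted")`, where the destination directory name is arbitrary.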

Jack-ym commented 5 years ago

I have run into the same problem. Have you found a correct solution?

LeSaul commented 4 years ago

I had the same issue and tried to unzip and open the file through the terminal, but the quick solution was to ask a workmate who uses Windows to help expand the file. Maybe the file path is too long, or idk why it's happening. Cross-platform issues?

tatoalo commented 4 years ago

Just happened to me today.

@LeSaul, instead of relying on a workmate you could maybe use this; it worked like a charm.

dbogatov commented 4 months ago

Install the latest unzip from brew: brew install unzip.