ashang / unar

unar mirror for maintenance to build on some distros
https://unarchiver.c3.cx/commandline

Select OEM code page for .zip archives based on system locale #4

Open unxed opened 4 years ago

unxed commented 4 years ago

My investigation shows that .zip archives created on DOS/Windows store file names in the OEM charset that corresponds to the system locale. To open them correctly without manual charset selection, we should implement the following logic:

  1. Does the .zip file have UTF-8 flags set? If so, assume UTF-8.
  2. Was the .zip file created on DOS/Windows ("0" or "11" in the HostOS field)? If so, assume OEM.
  3. Otherwise, assume UTF-8 (it is probably an archive created on Mac OS X that is actually UTF-8 but does not have the UTF-8 flag set).
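
The three steps above can be sketched in a few lines of Python. This is only an illustration of the decision order, not the p7zip patch itself. The function name `guess_name_encoding` and its parameters are hypothetical. The sketch assumes the "UTF8 flag" is bit 11 (0x0800) of the general purpose bit flag, and it uses the HostOS values 0 and 11 exactly as quoted in this issue.

```python
# Minimal sketch of the proposed detection order (names are hypothetical).
UTF8_FLAG = 0x0800   # bit 11 of the general purpose bit flag
HOST_FAT = 0         # HostOS value quoted in the issue (DOS / FAT)
HOST_NTFS = 11       # HostOS value quoted in the issue (Windows / NTFS)


def guess_name_encoding(flag_bits: int, host_os: int, oem_codepage: str) -> str:
    """Return the codec name to use for decoding an entry's file name."""
    # Step 1: the entry explicitly declares UTF-8.
    if flag_bits & UTF8_FLAG:
        return "utf-8"
    # Step 2: created on DOS/Windows, so assume the locale's OEM code page.
    if host_os in (HOST_FAT, HOST_NTFS):
        return oem_codepage
    # Step 3: anything else is assumed to be UTF-8 with the flag left unset.
    return "utf-8"
```

With Python's `zipfile` module, the two inputs would come from `ZipInfo.flag_bits` and `ZipInfo.create_system`; the OEM code page has to be supplied separately from the system locale.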

As for the OEM code page, we can select it based on the system locale, as Windows does. The code page table can be taken from Wine.
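
A locale-to-code-page lookup could look roughly like the sketch below. The table is a small illustrative subset, not the Wine table. The function name `oem_codepage_for_locale` and the fallback to cp437 are assumptions made for this example.

```python
# Minimal sketch: map a POSIX locale name to an OEM code page.
# The table is an illustrative subset; a real one would be taken from Wine.
OEM_CODEPAGES = {
    "en_US": "cp437",
    "de_DE": "cp850",
    "pl_PL": "cp852",
    "ru_RU": "cp866",
    "ja_JP": "cp932",
}

DEFAULT_OEM_CODEPAGE = "cp437"  # assumed fallback for unknown locales


def oem_codepage_for_locale(locale_name: str) -> str:
    """Return the OEM codec name for a locale such as 'ru_RU.UTF-8'."""
    # Drop the encoding ('.UTF-8') and modifier ('@euro') parts, if present.
    base = locale_name.split(".", 1)[0].split("@", 1)[0]
    return OEM_CODEPAGES.get(base, DEFAULT_OEM_CODEPAGE)
```

The locale name itself would typically be read from `LC_ALL`, `LC_CTYPE`, or `LANG`. The returned name can be passed directly to `bytes.decode()` to convert the raw file name stored in the archive.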

I wrote a patch for p7zip that implements this logic. It is very simple, and I hope it can be easily ported to unar: https://github.com/unxed/oemcp/blob/master/p7zip_oemcp_ZipItem.cpp.patch

Original discussion: https://github.com/mate-desktop/engrampa/issues/5