Open Jimmy-Z opened 1 year ago
Tested the attached zip file, folder.zip, on my Ubuntu setup, running a fresh minizip:
$ minizip -h
minizip-ng 3.0.9 - https://github.com/zlib-ng/minizip-ng
The zip file contains the following:
$ unzip -l -O cp936 folder.zip
Archive: folder.zip
Length Date Time Name
--------- ---------- ----- ----
0 2019-07-09 13:27 folder/
0 2019-07-09 13:26 folder/新建文本文档.txt
0 2019-07-09 13:27 folder/新建文档.docx
--------- -------
0 3 files
First, try listing its contents with minizip:
$ minizip -c 936 -l folder.zip
minizip-ng 3.0.9 - https://github.com/zlib-ng/minizip-ng
---------------------------------------------------
-c -l folder.zip
Packed Unpacked Ratio Method Attribs Date Time CRC-32 Name
------ -------- ----- ------ ------- ---- ---- ------ ----
0 0 0% stored 10 07-09-19 06:27 00000000 folder/
0 0 0% stored 20 07-09-19 06:26 00000000 folder/�½��ı��ĵ�.txt
0 0 0% stored 20 07-09-19 06:27 00000000 folder/�½��ĵ�.docx
I see the same encoding issue. Now use minizip to extract the contents of the zip file:
$ minizip -c 936 -x folder.zip
minizip-ng 3.0.9 - https://github.com/zlib-ng/minizip-ng
---------------------------------------------------
-c -x folder.zip
Archive folder.zip
Extracting folder/
Extracting folder/�½��ı��ĵ�.txt
Extracting folder/�½��ĵ�.docx
Note the encoding issue in the Extracting... lines.
Check what was written to disk:
$ ls -l folder
total 0
-rw-rw-rw- 1 paul paul 0 Jul 9 2019 新建文本文档.txt
-rw-rw-rw- 1 paul paul 0 Jul 9 2019 新建文档.docx
That looks fine.
It looks like there are (at least) two places where the code isn't doing what is expected when the -c option is specified: the -l listing and the Extracting... progress lines.
After a brief look at the code, I see that mz_os_utf8_string_create is used to do the UTF-8 encoding of the filename. That function is only called from mz_zip_reader_save_all, which is part of the extract workflow.
It would be helpful if you could submit a PR.
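To illustrate the suspected gap, here is a small Python model (not minizip-ng's C code). The names decode_entry_name, list_names and extract_names are hypothetical and only stand in for the idea that the listing path should go through the same code-page conversion as the extract path. It assumes the archive stores raw GBK bytes for the entry names, as the unzip -O cp936 output suggests.

```python
from typing import List, Optional


def decode_entry_name(raw: bytes, codepage: Optional[int]) -> str:
    """Decode a stored entry name (hypothetical helper).

    With a code page (e.g. 936) the bytes are decoded with that code page;
    without one they are treated as UTF-8, and invalid bytes turn into
    U+FFFD, which is the garbling seen in the listing.
    """
    encoding = f"cp{codepage}" if codepage else "utf-8"
    return raw.decode(encoding, errors="replace")


def list_names(raw_names: List[bytes], codepage: Optional[int] = None) -> List[str]:
    # Listing path: uses the same helper as extraction (the assumed fix).
    return [decode_entry_name(raw, codepage) for raw in raw_names]


def extract_names(raw_names: List[bytes], codepage: Optional[int] = None) -> List[str]:
    # Extract path: already converts names when a code page is given.
    return [decode_entry_name(raw, codepage) for raw in raw_names]
```

Under this model, list_names and extract_names return identical names for the same code page, which is the behaviour one would expect from -l -c 936 and -x -c 936.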
I got a zip file in CP936/GBK. -x -c 936 is able to extract the files correctly, but -l -c 936 also gave the same garbled file names.
Piping to iconv, like
minizip -l a.zip | iconv -f gbk -t utf8
works. It seems -c doesn't affect -l in any way.
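For reference, what the iconv step does to the listing can be sketched in a few lines of Python. The function name transcode is made up for this example, and it assumes the listing output is plain GBK bytes.

```python
def transcode(data: bytes, src: str = "gbk", dst: str = "utf-8") -> bytes:
    """Re-encode bytes from src to dst, roughly like `iconv -f gbk -t utf8`.

    Unlike iconv's default behaviour of stopping on invalid input, this
    sketch substitutes U+FFFD for bytes that are not valid in src.
    """
    return data.decode(src, errors="replace").encode(dst)
```

ASCII parts of the listing (sizes, dates, the folder/ prefix) pass through unchanged, since GBK and UTF-8 agree on the ASCII range; only the multi-byte file names are rewritten.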