selmf / unarr

A decompression library for rar, tar, zip and 7z archives
GNU Lesser General Public License v3.0

Cannot unarchive 7z files #12

Open coolaj86 opened 3 years ago

coolaj86 commented 3 years ago

Test file: https://github.com/emukidid/swiss-gc/releases/download/v0.5r922/swiss_r922.7z

Building unarr-test

wget -O unarr-master.zip https://github.com/selmf/unarr/archive/master.zip
unzip unarr-master.zip
pushd unarr-master
mkdir build
pushd build
cmake .. -DBUILD_SHARED_LIBS=OFF -DENABLE_7Z=ON -DBUILD_SAMPLES=ON
make

Running unarr-test

./unarr-test ./swiss_r922.7z
Parsing "swiss_r922.7z":
01. swiss_r922/ActionReplay/SDLOADER.BIN (@11)
! _7z.c:135: Failed to extract file at index 11 (failed with error 4)
Warning: Failed to uncompress... skipping
02. swiss_r922/DOL/Viper/swiss_r922-lz-viper.dol (@12)
! _7z.c:135: Failed to extract file at index 12 (failed with error 3)
Warning: Failed to uncompress... skipping
03. swiss_r922/DOL/swiss_r922-compressed.dol (@13)
! _7z.c:135: Failed to extract file at index 13 (failed with error 3)
Warning: Failed to uncompress... skipping
...

selmf commented 3 years ago

7z t swiss_r922.7z

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=de_DE.UTF-8,Utf16=on,HugeFiles=on,64 bits,8 CPUs Intel(R) Core(TM) i7-2670QM CPU @ 2.20GHz (206A7),ASM,AES-NI)

Scanning the drive for archives:
1 file, 3622848 bytes (3538 KiB)

Testing archive: ../test/swiss_r922.7z
--
Path = ../test/swiss_r922.7z
Type = 7z
Physical Size = 3622848
Headers Size = 715
Method = LZMA:24 BCJ PPC
Solid = +
Blocks = 3

Everything is Ok

Folders: 11
Files: 19
Size:       33161999
Compressed: 3622848

Method = LZMA:24 BCJ PPC

The decompression code for 7z archives is taken from the ANSI-C implementation of the LZMA SDK. As far as I can tell, that code only includes filters for x86 and IA64 but not for PPC and other architectures, which is the reason the decompression is failing.

coolaj86 commented 3 years ago

Is that to say this is an endianness issue?

selmf commented 3 years ago

No, this is simply a missing decompression method/filter. There is a good chance I can fix this by adding the missing method from the 7z code, but I need to look into it first.
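
For readers wondering what such a filter does: a BCJ ("branch/call/jump") filter rewrites the relative targets of branch instructions into absolute addresses before compression, so repeated calls to the same function produce identical byte patterns, and the decoder reverses the rewrite after LZMA decompression. The following is a minimal sketch of a PowerPC variant of that technique. It is an illustration only, not the code used by unarr or the LZMA SDK, and the name `ppc_branch_filter` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch of a BCJ-style PowerPC branch filter (hypothetical helper).
 * Scans 4-byte aligned words for "bl" instructions (primary opcode 18,
 * AA = 0, LK = 1) and converts the branch target between relative and
 * absolute form. encoding != 0: relative -> absolute (before compression);
 * encoding == 0: absolute -> relative (after decompression).
 * ip is the assumed address of data[0]. Returns the number of bytes processed.
 */
size_t ppc_branch_filter(uint8_t *data, size_t size, uint32_t ip, int encoding)
{
    size_t i;

    if (size < 4)
        return 0;

    for (i = 0; i + 4 <= size; i += 4) {
        if ((data[i] >> 2) == 0x12 && (data[i + 3] & 3) == 1) {
            /* Assemble the 26-bit target field byte by byte (big-endian
             * instruction layout, independent of the host's endianness). */
            uint32_t src = ((uint32_t)(data[i] & 3) << 24) |
                           ((uint32_t)data[i + 1] << 16) |
                           ((uint32_t)data[i + 2] << 8) |
                           ((uint32_t)data[i + 3] & ~(uint32_t)3);
            uint32_t pos = ip + (uint32_t)i;
            uint32_t dest = encoding ? src + pos : src - pos;

            /* Write the converted target back, keeping opcode and flag bits. */
            data[i]     = (uint8_t)(0x48 | ((dest >> 24) & 3));
            data[i + 1] = (uint8_t)(dest >> 16);
            data[i + 2] = (uint8_t)(dest >> 8);
            data[i + 3] = (uint8_t)((data[i + 3] & 3) | (dest & ~(uint32_t)3));
        }
    }
    return i;
}
```

Because the filter reads and writes individual bytes, it behaves the same on little-endian and big-endian hosts, which is why a failure here would be a missing-filter problem rather than an endianness problem.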

selmf commented 3 years ago

So I just had a closer look at this and it appears the filter is not missing. The errors unarr shows indicate a problem with CRC values (error 3) and an unsupported format (error 4). p7zip 16.02 on my machine reports the archive as valid, while the ANSI-C decoder from LZMA SDK 19.00 reports it as unsupported. This archive was likely not created with standard parameters, and I need to investigate more thoroughly what exactly is happening here.
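
The numeric codes in the log line up with the `SRes` result constants defined in the LZMA SDK's `7zTypes.h`. As a reading aid, here is a small lookup sketch covering the first few of them. The helper name `sres_name` is hypothetical and is not part of unarr or the SDK, and the list is deliberately incomplete.

```c
#include <stddef.h>

/*
 * Hypothetical helper: map an LZMA SDK SRes value to its constant name.
 * Only the first few codes from 7zTypes.h are listed; anything else is
 * reported as "unknown" by this sketch.
 */
const char *sres_name(int res)
{
    switch (res) {
    case 0:  return "SZ_OK";
    case 1:  return "SZ_ERROR_DATA";
    case 2:  return "SZ_ERROR_MEM";
    case 3:  return "SZ_ERROR_CRC";          /* "failed with error 3" */
    case 4:  return "SZ_ERROR_UNSUPPORTED";  /* "failed with error 4" */
    case 5:  return "SZ_ERROR_PARAM";
    default: return "unknown";
    }
}
```

Printing `sres_name(res)` next to the number in a message like the one from `_7z.c:135` would make it easier to tell a checksum mismatch from an unsupported method at a glance.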

mastercoms commented 2 years ago

Seems like the archival step has changed since this issue was created (possibly prompted by the report). https://github.com/emukidid/swiss-gc/commit/db7c168351f8443c5bac5a97ab6082aad4e5feb5

selmf commented 2 years ago

Thanks for the follow-up. I checked and the new files provided by swiss-gc work fine. I am not sure why the old files do not work, but this is likely due to a bug or edge case in the lzma-sdk. Funny thing is, p7zip actually reports the same filter/compression combination for both the 'broken' and the new files on my system.