aonez / Keka

The macOS & iOS file archiver
https://www.keka.io

7z: Trailing garbage #1521

Open systemcrash opened 1 month ago

systemcrash commented 1 month ago

Configuration

Describe the bug

Try any of these: https://downloads.openwrt.org/releases/23.05.5/targets/x86/64/openwrt-23.05.5-x86-64-generic-ext4-combined-efi.img.gz

OS: Version 13.6.9 (Build 22G830) (x86_64)
Keka: v1.4.4-r5475 (WEB) (Sandboxed) (en-GB)
Format detected: GZIP
Binary used: keka7zz
Arguments: (
    x,
    "/Users/asdf/Downloads/openwrt-23.05.5-x86-64-generic-ext4-combined-efi.img.gz",
    "-snld",
    "-aou",
    "-xr!__MACOSX",
    "-bsp1"
)

7-Zip (z) 24.08 (x64) : Copyright (c) 1999-2024 Igor Pavlov : 2024-08-11 : Modified by aone for Keka
 64-bit locale=en_US.UTF-8 Threads:8 OPEN_MAX:2560

Scanning the drive for archives:
  0M Scan /Users/asdf/Downloads/

1 file, 11538909 bytes (12 MiB)

Extracting archive: /Users/asdf/Downloads/openwrt-23.05.5-x86-64-generic-ext4-combined-efi.img.gz

--
Path = /Users/asdf/Downloads/openwrt-23.05.5-x86-64-generic-ext4-combined-efi.img.gz
Type = gzip
Headers Size = 10

  0%

Sub items Errors: 1

ERROR: There are some data after the end of the payload data : openwrt-23.05.5-x86-64-generic-ext4-combined-efi.img

Archives with Errors: 1

Sub items Errors: 1

OS: Version 13.6.9 (Build 22G830) (x86_64)
Keka: v1.4.4-r5475 (WEB) (Sandboxed) (en-GB)
Format detected: GZIP
Binary used: kekaunar
Arguments: (
    "-q",
    "-r",
    "-D",
    "-K",
    "-nq",
    "-o",
    "/Users/asdf/Downloads/openwrt-23.05.5-x86-64-generic-ext4-combined-efi.img.kextraction/Operation",
    "/Users/asdf/Downloads/openwrt-23.05.5-x86-64-generic-ext4-combined-efi.img.gz"
)

openwrt-23.05.5-x86-64-generic-ext4-combined-efi.img... Failed! (Data is corrupted)

Extraction to directory "/Users/asdf/Downloads/openwrt-23.05.5-x86-64-generic-ext4-combined-efi.img.kextraction/Operation" failed (1 file failed.)

Error code 1

Header:

Hex View  00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F

00000000  1F 8B 08 00 00 00 00 00  02 03 EC 5C 69 54 53 D7  ...........\iTS.
00000010  B7 3F 37 83 84 C9 80 03  D2 3A A1 E2 80 D6 4A 70  .?7......:....Jp

Garbage in question:

Hex View  00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F

00B011B0  00 C0 41 F0 B5 48 21 00  7E 84 07 23 20 66 61 6B  ..A..H!.~..# fak
00B011C0  65 20 63 65 72 74 69 66  69 63 61 74 65 46 57 78  e certificateFWx
00B011D0  30 76 5A EE B8 00 00 00  00 00 00 00 22           0vZ........."

So I don't think this is a problem in Keka itself, but rather in the decompressors it uses. Both 7zz and kekaunar seem to be bound to the archive file length when deciding where the data ends. With no other size information to go on, since the header carries no optional fields (probably because the gzip was created from piped data), this becomes a problem.
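For reference, the 10 header bytes in the hex view above can be decoded against the gzip format (RFC 1952). This is only a minimal Python sketch that uses the bytes shown in the dump; the variable names are my own and nothing here comes from Keka or 7-Zip:

```python
import struct

# First 10 bytes of the file, copied from the hex view above.
header = bytes.fromhex("1F 8B 08 00 00 00 00 00 02 03")

# RFC 1952 fixed header: ID1 ID2 | CM | FLG | MTIME (4 bytes, LE) | XFL | OS
magic, method, flags, mtime, xfl, os_id = struct.unpack("<2sBBIBB", header)

# Optional-field flag bits defined by RFC 1952.
FHCRC, FEXTRA, FNAME, FCOMMENT = 0x02, 0x04, 0x08, 0x10

print("gzip magic ok:", magic == b"\x1f\x8b")
print("method is deflate:", method == 8)
print("has original file name:", bool(flags & FNAME))
print("has extra field:", bool(flags & FEXTRA))
print("mtime:", mtime, "xfl:", xfl, "os:", os_id)
```

FLG is 0 here, so the member stores no original file name, comment, or extra field, and MTIME is 0 as well. That would be consistent with the archive having been created from a pipe, though the dump alone does not prove it.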

gzip successfully decompresses, and ignores the 'trailing garbage' with:

# gzip -d openwrt-23.05.5-x86-64-generic-ext4-combined-efi.img.gz
gzip: openwrt-23.05.5-x86-64-generic-ext4-combined-efi.img.gz: trailing garbage ignored

The trailing garbage should not be there, but it is.
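To illustrate the lenient behaviour, here is a small Python sketch that decompresses the first gzip member and reports whatever bytes follow it instead of failing. It uses only the standard `zlib` and `gzip` modules; the function name `gunzip_first_member` and the sample data are hypothetical, and this is not how gzip, 7zz, or kekaunar are actually implemented:

```python
import gzip
import zlib


def gunzip_first_member(data: bytes) -> tuple[bytes, bytes]:
    """Decompress the first gzip member; return (payload, trailing bytes)."""
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    d = zlib.decompressobj(zlib.MAX_WBITS | 16)
    payload = d.decompress(data)
    if not d.eof:
        # The deflate stream or gzip trailer ended early: real corruption.
        raise ValueError("truncated gzip stream")
    # Anything after the member's CRC32/ISIZE trailer lands in unused_data.
    return payload, d.unused_data


# Stand-in for the real image: a valid gzip member followed by junk.
blob = gzip.compress(b"disk image bytes") + b"# fake certificate"
payload, trailing = gunzip_first_member(blob)

if trailing:
    print(f"trailing garbage ignored ({len(trailing)} bytes)")
```

The deflate stream marks its own end, so the decompressor does not need the file length to know where the payload stops. Note that this sketch handles a single member only; a file made of several concatenated gzip members would need a loop over `unused_data`.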

aonez commented 1 month ago

Thanks a lot for the feedback @systemcrash 👍🏼

Keka will now warn about files with trailing garbage, but the extracted data will no longer be removed as incomplete. Here is a build with this fixed: https://github.com/aonez/Keka/releases/download/dev-test-builds/Keka-v1.4.5.r5481.7z

systemcrash commented 1 month ago

Thanks @aonez, I now get:

Dismissed trailing data, the extraction might be incomplete

OS: Version 13.6.9 (Build 22G830) (x86_64)
Keka: v1.4.5-r5481 (WEB) (Sandboxed) (en-GB)
Format detected: GZIP
Binary used: keka7zz
Arguments: (
    x,
    "/Users/asdf/Downloads/openwrt-23.05.5-x86-64-generic-ext4-combined-efi.img.gz",
    "-snld",
    "-aou",
    "-xr!__MACOSX",
    "-bsp1"
)

7-Zip (z) 24.08 (x64) : Copyright (c) 1999-2024 Igor Pavlov : 2024-08-11 : Modified by aone for Keka
 64-bit locale=en_US.UTF-8 Threads:8 OPEN_MAX:2560

Scanning the drive for archives:
  0M Scan /Users/asdf/Downloads/

1 file, 11538909 bytes (12 MiB)

Extracting archive: /Users/asdf/Downloads/openwrt-23.05.5-x86-64-generic-ext4-combined-efi.img.gz

--
Path = /Users/asdf/Downloads/openwrt-23.05.5-x86-64-generic-ext4-combined-efi.img.gz
Type = gzip
Headers Size = 10

  0%

Sub items Errors: 1

ERROR: There are some data after the end of the payload data : openwrt-23.05.5-x86-64-generic-ext4-combined-efi.img

Archives with Errors: 1

Sub items Errors: 1

Error code 343

I figure this is an acceptable compromise - perhaps the trailing garbage strictness could be configured with a boolean?

aonez commented 3 weeks ago

@systemcrash I usually prefer to inform the user when there's extra data instead of silently accepting it. The user should decide whether it is OK or not. I've seen the macOS bundled archiver extract a corrupted tar.gz, producing incomplete output without any warning.

~That said, a hidden flag is a good option (added in the wiki): SilentlyIgnoreTrailingData~

It will be added in the next revision, v1.4.6. You can already set the flag now so it is in place when that version is ready.

systemcrash commented 3 weeks ago

Is there a good motivation for a hidden flag versus a GUI checkbox?