kyz / libmspack

A library for some loosely related Microsoft compression formats, CAB, CHM, HLP, LIT, KWAJ and SZDD.
https://www.cabextract.org.uk/libmspack/
169 stars 45 forks source link

Regression when extracting cabinets using "-F" option #22

Closed austin987 closed 6 years ago

austin987 commented 6 years ago

Hi,

I'm the maintainer for winetricks. I had a bug report recently (https://github.com/Winetricks/winetricks/issues/1120) about something failing to extract. Upon further investigation, seems to be a regression in cabextract:

d3d899be6100c3399fc8c915ca768f3aafe6d9a6 is the first bad commit
commit d3d899be6100c3399fc8c915ca768f3aafe6d9a6
Author: Micah Snyder <micasnyd@cisco.com>
Date:   Sun Sep 16 18:19:11 2018 -0400

    Added CAB decompression parameter MSCABD_PARAM_SALVAGE which makes a best effort to extract as many files as possible from damaged, mangled, malformed or otherwise non-standard CAB archives.

:040000 040000 cd5cb2957dfa85165f28fbe1d6fe462a2906bc7c be39207070d5060291a5b1e3885c0228dbba7ebd M  libmspack

Before this patch:

Executing cabextract -q -d /home/austin/.wine/dosdevices/c:/windows/temp/_mspatcha -L -F i386/mspatcha.dl_ /home/austin/.cache/winetricks/win2ksp4/W2KSP4_EN.EXE
Executing cabextract -q --directory=/home/austin/.wine/dosdevices/c:/windows/system32 /home/austin/.wine/dosdevices/c:/windows/temp/_mspatcha/i386/mspatcha.dl_
Using native,builtin override for following DLLs: mspatcha

After this patch:

Executing cabextract -q -d /home/austin/.wine/dosdevices/c:/windows/temp/_mspatcha -L -F i386/mspatcha.dl_ /home/austin/.cache/winetricks/win2ksp4/W2KSP4_EN.EXE
/home/austin/.cache/winetricks/win2ksp4/W2KSP4_EN.EXE: WARNING; non-maximal data block
/home/austin/.cache/winetricks/win2ksp4/W2KSP4_EN.EXE: WARNING; non-maximal data block
Executing cabextract -q --directory=/home/austin/.wine/dosdevices/c:/windows/system32 /home/austin/.wine/dosdevices/c:/windows/temp/_mspatcha/i386/mspatcha.dl_
/home/austin/.wine/dosdevices/c:/windows/temp/_mspatcha/i386/mspatcha.dl_: no valid cabinets found
------------------------------------------------------
Note: command cabextract -q --directory=/home/austin/.wine/dosdevices/c:/windows/system32 /home/austin/.wine/dosdevices/c:/windows/temp/_mspatcha/i386/mspatcha.dl_ returned status 1. Aborting.

The patch also breaks the tests for me:

make: Entering directory '/home/austin/src/cabextract/cabextract/test'
../cabextract -l case.cab >case.list
../cabextract -L -l case.cab >>case.list
../cabextract -d DIR/PATH -l case.cab >>case.list
../cabextract -d DIR/PATH -L -l case.cab >>case.list
../cabextract -l dirwalk-vulns.cab >dirwalk-vulns.list
../cabextract -e koi8-ru -l encoding-koi8.cab >encoding-koi8.list
../cabextract -e iso-8859-1 -l encoding-latin1.cab >encoding-latin1.list
../cabextract -e sjis -l encoding-sjis.cab >encoding-sjis.list
../cabextract -l search.cab >search.list
../cabextract -l simple.cab >simple.list
../cabextract -l utf8-stresstest.cab >utf8-stresstest.list
../cabextract -p mixed.cab >mixed.test
stdout(mszip.txt): No such file or directory
stdout(lzx.txt): No such file or directory
stdout(qtm.txt): No such file or directory
make: *** [Makefile:27: mixed.test] Error 1
make: Leaving directory '/home/austin/src/cabextract/cabextract/test'

With 5c1b259a9f719f33c2a31d7b11bb4b198844ca6a (parent commit), winetricks works and cabextract tests pass.

Please let me know if I can provide any further information.

kyz commented 6 years ago

Are you using a specific architecture? I've downloaded W2KSP4_EN.EXE and tested cabextract on both this and test/mixed.cab but can't replicate your findings with either make clean all CFLAGS='-march=i686 -m32' or make clean all CFLAGS='-march=x86-64 -m64'

You could also try building with debug mode enabled (e.g. CFLAGS=-DDEBUG) to see if any useful messages are printed.

On 32-bit i686:

$ make clean all CFLAGS='-march=i686 -m32' >/dev/null
ar: `u' modifier ignored since `D' is the default (see `U')
$ objdump -f cabextract

cabextract:     file format elf32-i386
architecture: i386, flags 0x00000150:
HAS_SYMS, DYNAMIC, D_PAGED
start address 0x00000e60

$ sha256sum W2KSP4_EN.EXE test/mixed.cab 
167bb78d4adc957cc39fb4902517e1f32b1e62092353be5f8fb9ee647642de7e  W2KSP4_EN.EXE
0ce0b55fe705b744d41bb361170c0467db30da0c7f9bdd386d5dade71a78e171  test/mixed.cab
$ ./cabextract -t W2KSP4_EN.EXE test/mixed.cab | tail
  i386/winntupg/msvcrt.dll  OK                 ba7be6f92680b28b9031170659fd222d
  i386/winntupg/ntdsupg.dll  OK                d01008c247ed6977e9aececee27fcd84
  i386/winntupg/ms/modemshr/netmshr.inf  OK    0e91f4785ca6521cd0441d538f5ad50c
  i386/winntupg/ms/modemshr/netsrdr.inf  OK    072de7893a65d1216dc3138605539c42
Testing cabinet: test/mixed.cab
  mszip.txt  OK                                940cba86658fbceb582faecd2b5975d1
  lzx.txt  OK                                  703474293b614e7110b3eb8ac2762b53
  qtm.txt  OK                                  98fcfa4962a0f169a3c7fdbcb445cf17

All done, no errors.
$ ./cabextract -L -F i386/mspatcha.dl_ W2KSP4_EN.EXE 
Extracting cabinet: W2KSP4_EN.EXE
  extracting i386/mspatcha.dl_

All done, no errors.
$ sha256sum i386/mspatcha.dl_
0a2ef27944bdfdb07c79d51f5f5fb4b3cc46f0528d10eb25b40d9c6b756cf82e  i386/mspatcha.dl_

On 64-bit x86_64:

$ make clean all CFLAGS='-march=x86-64 -m64' >/dev/null
ar: `u' modifier ignored since `D' is the default (see `U')
$ objdump -f cabextract

cabextract:     file format elf64-x86-64
architecture: i386:x86-64, flags 0x00000150:
HAS_SYMS, DYNAMIC, D_PAGED
start address 0x0000000000001570

$ sha256sum W2KSP4_EN.EXE test/mixed.cab 
167bb78d4adc957cc39fb4902517e1f32b1e62092353be5f8fb9ee647642de7e  W2KSP4_EN.EXE
0ce0b55fe705b744d41bb361170c0467db30da0c7f9bdd386d5dade71a78e171  test/mixed.cab
$ ./cabextract -t W2KSP4_EN.EXE test/mixed.cab | tail
  i386/winntupg/msvcrt.dll  OK                 ba7be6f92680b28b9031170659fd222d
  i386/winntupg/ntdsupg.dll  OK                d01008c247ed6977e9aececee27fcd84
  i386/winntupg/ms/modemshr/netmshr.inf  OK    0e91f4785ca6521cd0441d538f5ad50c
  i386/winntupg/ms/modemshr/netsrdr.inf  OK    072de7893a65d1216dc3138605539c42
Testing cabinet: test/mixed.cab
  mszip.txt  OK                                940cba86658fbceb582faecd2b5975d1
  lzx.txt  OK                                  703474293b614e7110b3eb8ac2762b53
  qtm.txt  OK                                  98fcfa4962a0f169a3c7fdbcb445cf17

All done, no errors.
$ ./cabextract -F i386/mspatcha.dl_ W2KSP4_EN.EXE 
Extracting cabinet: W2KSP4_EN.EXE
  extracting i386/mspatcha.dl_

All done, no errors.
$ sha256sum i386/mspatcha.dl_
0a2ef27944bdfdb07c79d51f5f5fb4b3cc46f0528d10eb25b40d9c6b756cf82e  i386/mspatcha.dl_
kyz commented 6 years ago

You should thank your contributor Goatroth, their words are exactly what I needed to hear: " it's always produced corrupt cabs (but only when using -F)."

Using -F usually means decompressing and throwing away data (skipping data) in a folder to get to the offset of the file you want to extract. Almost all other invocations extract all files in order, so never skip data.

If this doesn't happen, data is extracted from the folder, but it's probably the wrong data, and there are no post-decompression checksums.

A line of code was added by the 'salvage mode' patch to deliberately not do this essential step to get to the right offset, if the folder type was LZX. I called this out when first looking at code, but didn't notice it was there when accepting the patch, and didn't notice it while finding numerous other things to fix and release. There was no regression test for this use case.

So, I've fixed the bug, and added a regression test. Thank your submitter for reporting this, and thanks to you for following up on it.

austin987 commented 6 years ago

Thanks for the super quick fix @kyz, fixed by 2d86d4e70026cd03730ce0b00b12579c2e21620a