miurahr / py7zr

7zip in python3 with ZStandard, PPMd, LZMA2, LZMA1, Delta, BCJ, BZip2, and Deflate compressions, and AES encryption.
https://pypi.org/project/py7zr/
GNU Lesser General Public License v2.1

archiveinfo() UnsupportedCompressionMethodError with BCJ archive #100

Closed (ganego closed this issue 4 years ago)

ganego commented 4 years ago

Describe the bug

py7zr throws UnsupportedCompressionMethodError when opening a file and calling .archiveinfo().

To Reproduce

1. Create a BCJ archive: 7z a -mf=BCJ test.7z test.txt (create some test.txt first).
2. Verify with 7z that it is BCJ; my archive was LZMA:12k BCJ.
3. Open the file with py7zr and call archiveinfo().

import py7zr
archive = py7zr.SevenZipFile('test.7z', 'r').archiveinfo()

Crash:

File "...\py7zr\compression.py", line 433, in get_methods_names
    methods_names.append(methods_name_map[coder['method']])
KeyError: b'\x03\x03\x01\x03'

miurahr commented 4 years ago

I cannot reproduce this with a file produced by the procedure you posted. The resulting file is 'LZMA2:BCJ' in my environment.

I've just created another file with 7z a -m1=LZMA -mf=BCJ test.7z test1.txt. Does it produce the same error you got?

Could you upload a test file that reproduces the case? a.zip

ganego commented 4 years ago

With the fixes applied I no longer get any crashes for several different BCJ archives I created.

But I tested another variant that still crashes. See this site: https://sevenzip.osdn.jp/chm/cmdline/switches/method.htm and scroll down to Supported filters for 7z.

BCJ2 crashes with Unknown method b'\x03\x03\x01\x1b'
All other tested filters work fine with either LZMA or LZMA2.

EDIT: I just looked through the code a bit and saw py7zr.properties.CompressionMethod. Compared to that, py7zr.compression.get_methods_names is missing many names, so it will also crash for, say, archives created with -m0=COPY.
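To illustrate the failure mode, here is a minimal standalone sketch of a name lookup that tolerates unknown method IDs instead of raising KeyError. It is not py7zr's code: METHOD_NAMES and describe_methods are hypothetical names, and the table holds only the two IDs quoted in this thread plus a few common 7z method IDs.

```python
# Hypothetical table for illustration only; this is NOT py7zr's real
# methods_name_map. The BCJ and BCJ2 IDs are the ones quoted in this thread.
METHOD_NAMES = {
    b'\x00': 'COPY',
    b'\x21': 'LZMA2',
    b'\x03\x01\x01': 'LZMA',
    b'\x03\x03\x01\x03': 'BCJ',
    b'\x03\x03\x01\x1b': 'BCJ2',
}


def describe_methods(coders):
    """Return a display name for every coder without raising on unknown IDs."""
    names = []
    for coder in coders:
        method_id = coder['method']
        # dict.get avoids the KeyError; unknown IDs are reported as hex.
        fallback = 'unknown(' + method_id.hex() + ')'
        names.append(METHOD_NAMES.get(method_id, fallback))
    return names
```

With a lookup like this, listing the methods of an archive would still work for a filter the library cannot decode, and the "unsupported" error could be deferred to extraction.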

miurahr commented 4 years ago

Python core does not support BCJ2, so py7zr does not support it either. If you want BCJ2 support, please contribute to liblzma, which the Python core library links against.

ganego commented 4 years ago

.archiveinfo() should still not crash. It should just return the requested information about the archive. If extraction is not supported, that is fine, but the exception should be raised when extracting, not when requesting archive information. For example, I do not need to extract anything; I only need to know whether an archive is solid.
Thank you

miurahr commented 4 years ago

It is not a crash. You should catch py7zr.exceptions.UnsupportedCompressionMethodError properly with try/except.
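A minimal sketch of that suggestion, assuming py7zr is installed. The helper name read_archive_info is hypothetical; it returns None when the archive uses a method py7zr cannot handle.

```python
def read_archive_info(path):
    """Return archiveinfo() for a 7z file, or None if its method is unsupported.

    read_archive_info is a hypothetical helper name, not part of py7zr.
    """
    # Imported here so that this module can be loaded without py7zr installed.
    import py7zr
    from py7zr.exceptions import UnsupportedCompressionMethodError

    try:
        # The context manager closes the file even if archiveinfo() raises.
        with py7zr.SevenZipFile(path, 'r') as archive:
            return archive.archiveinfo()
    except UnsupportedCompressionMethodError:
        # For example, a BCJ2 archive as discussed above.
        return None
```

Usage would look like info = read_archive_info('test.7z'), followed by a check for None before reading any fields. Note that with this approach no archive information is available for unsupported archives, which is the limitation raised earlier in the thread.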