tukaani-project / xz-java

XZ for Java
https://tukaani.org/xz/java.html
BSD Zero Clause License

[Bug]: Decompression with xz-java results in CorruptedInputException, while decompression with xz works just fine #8

Closed: jvanheesch closed this issue 2 months ago

jvanheesch commented 8 months ago

Describe the bug

Decompressing the byte array {93, 0, 0, 0, 2, 6, 0, 0, 0, 0, 0, 0, 0, 0, 57, -104, 73, -2, -17, -28, -15, 86, -9, -33, -1, -3, -111, 16, 0} with xz-java results in a CorruptedInputException. The array is the result of compressing the string "sample" using lzma-js, as demonstrated here. Decompressing the same data with xz works flawlessly, as does decompression with lzma-java and lzma-js.

Reproducer: https://github.com/jvanheesch/lzma

Version

1.9

Operating System

macOS 13.4.1 (22F82)

Relevant log output

No response

JiaT75 commented 8 months ago

Hello! Thanks for the detailed bug report.

The issue here is that the failing "sample" encoding has the uncompressed size set in the header (0x6) and also uses an End of Payload Marker. XZ for Java rejects such files intentionally: XZ Utils also rejected them until 5.2.6, when it was noticed that LZMA SDK (the 7-Zip implementation) allows such files and is documented to do so. If you test this file on xz <= 5.2.5, it should also be rejected.

So XZ for Java needs to support these files as well, and we will keep this issue open until that support is added. Thanks for bringing this to our attention!
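
To see the "uncompressed size set in the header" part for yourself, here is a minimal sketch that decodes the 13-byte .lzma header (one properties byte, a 4-byte little-endian dictionary size, an 8-byte little-endian uncompressed size) from the bytes in the report. The class name `LzmaAloneHeader` and its fields are made up for this illustration and are not part of XZ for Java. It only reads the header; whether the stream also ends with an End of Payload Marker can't be determined from these 13 bytes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not part of the xz-java API.
public class LzmaAloneHeader {
    static final int HEADER_SIZE = 13;

    final int lc;                 // literal context bits
    final int lp;                 // literal position bits
    final int pb;                 // position bits
    final long dictSize;          // dictionary size in bytes (unsigned 32-bit)
    final long uncompressedSize;  // -1 (all bits set) means "unknown"

    LzmaAloneHeader(int lc, int lp, int pb, long dictSize, long uncompressedSize) {
        this.lc = lc;
        this.lp = lp;
        this.pb = pb;
        this.dictSize = dictSize;
        this.uncompressedSize = uncompressedSize;
    }

    static LzmaAloneHeader parse(byte[] data) {
        if (data.length < HEADER_SIZE) {
            throw new IllegalArgumentException("need at least 13 header bytes");
        }

        // Properties byte encodes (pb * 5 + lp) * 9 + lc.
        int props = data[0] & 0xFF;
        if (props > (4 * 5 + 4) * 9 + 8) {
            throw new IllegalArgumentException("invalid properties byte: " + props);
        }
        int lc = props % 9;
        int rest = props / 9;
        int lp = rest % 5;
        int pb = rest / 5;

        // The remaining 12 header bytes are little-endian integers.
        ByteBuffer buf = ByteBuffer.wrap(data, 1, 12).order(ByteOrder.LITTLE_ENDIAN);
        long dictSize = buf.getInt() & 0xFFFFFFFFL;
        long uncompressedSize = buf.getLong();

        return new LzmaAloneHeader(lc, lp, pb, dictSize, uncompressedSize);
    }

    // True when the header states an explicit size instead of "unknown".
    boolean hasKnownSize() {
        return uncompressedSize != -1L;
    }

    public static void main(String[] args) {
        // The byte array from the bug report.
        byte[] sample = {93, 0, 0, 0, 2, 6, 0, 0, 0, 0, 0, 0, 0, 0, 57, -104, 73,
                -2, -17, -28, -15, 86, -9, -33, -1, -3, -111, 16, 0};
        LzmaAloneHeader h = parse(sample);
        System.out.println("lc=" + h.lc + " lp=" + h.lp + " pb=" + h.pb
                + " dictSize=" + h.dictSize
                + " uncompressedSize=" + h.uncompressedSize
                + " knownSize=" + h.hasKnownSize());
    }
}
```

For the reported bytes this gives lc=3, lp=0, pb=2, a 32 MiB dictionary, and an uncompressed size of 6, which matches the six characters of "sample". A header with an unknown size would instead carry eight 0xFF bytes in the size field.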