indygreg / python-zstandard

Python bindings to the Zstandard (zstd) compression library
BSD 3-Clause "New" or "Revised" License

zstandard >= 0.16.0 has block splitting enabled by default for higher compression levels which has different decompression behavior #184

Closed jacoghi closed 1 year ago

jacoghi commented 1 year ago

Hello, zstandard >= 0.16.0 broke compatibility with apps built with versions <= 0.15.2. This is not mentioned in the release notes; we only found out about it here: https://github.com/nicoboss/nsz/issues/120. Updating the release notes to reflect the new behavior should suffice. Given that the biggest change from 0.15.2 to 0.16.0 was moving from zstd 1.4.8 to 1.5.0, I suspect this upstream issue might not be completely accurate: https://github.com/facebook/zstd/issues/3110. Thanks

indygreg commented 1 year ago

There should not have been undocumented backwards incompatible changes! And this is the first I'm hearing about any undocumented regressions.

Do you have steps to reproduce the regression? Maybe a snippet of code that changed behavior between 0.15.2 and 0.16.0?

nicoboss commented 1 year ago

If one uses nsz with python-zstandard v0.15.2 or earlier the produced file can be successfully decompressed with multiple 3rd party software. If compressed using python-zstandard v0.16.0 or later decompressing with the same 3rd party software results in corrupted data. This doesn't just affect one single bad third-party software but multiple independent projects.

3rd party software incompatible with files produced using python-zstandard >= 0.16.0:

3rd party software that has no issues with files produced using python-zstandard >= 0.16.0:

None of the above-mentioned software projects experience any issues with files compressed using python-zstandard <= v0.15.2

How to reproduce: compress a Nintendo Switch game using NSZ v4.1.0 with python-zstandard <= v0.15.2 and try installing it using AtmoXL-Titel-Installer. It will work. Now repeat the same steps with python-zstandard >= 0.16.0 and the installed game will contain corrupted data, which results in it either crashing on startup or very soon afterwards.

I don't think it's a python-zstandard issue, but there probably should be a warning in the release notes that this update has the potential to break compatibility with 3rd party software, as small open-source projects lack the resources and expertise to do such research on their own.

indygreg commented 1 year ago

I looked at the code changes in this project between 0.15.2 and 0.16.0 and the only thing that has the potential to introduce backwards incompatible behavior is the upgrade of libzstd from 1.4.8 to 1.5.0. But I see nothing in the upstream release notes to indicate an unwanted change in behavior. That made me suspicious this was a bug in nsz.

So I looked at nsz's use of this package and found the following.

Double Flushing

In SolidCompressor.py, we have the following pattern:

cctx = ZstdCompressor(level=compressionLevel)
compressor = cctx.stream_writer(f)
...
compressor.flush(FLUSH_FRAME)
compressor.flush(COMPRESSOBJ_FLUSH_FINISH)

The use of COMPRESSOBJ_FLUSH_FINISH here is wrong. The docs for ZstdCompressionWriter.flush() state that only the FLUSH_* constants are allowed. COMPRESSOBJ_FLUSH_FINISH is intended for use with ZstdCompressionObj.flush().

However, these constants have overlapping values:

FLUSH_BLOCK = 0
FLUSH_FRAME = 1

COMPRESSOBJ_FLUSH_FINISH = 0
COMPRESSOBJ_FLUSH_BLOCK = 1

So by issuing the 2nd compressor.flush(COMPRESSOBJ_FLUSH_FINISH), you are effectively doing a FLUSH_BLOCK. I haven't tested this, but I suspect that 2nd flush starts a new frame or otherwise causes the compressor to emit more output. This effectively produces a corrupt output stream. This behavior may have changed between zstd 1.4.8 and 1.5.0 and might warrant a report upstream in case they want to treat this scenario as an error in future releases.

nsz should remove the compressor.flush(COMPRESSOBJ_FLUSH_FINISH) line.

I suspect the reason nsz's test cases didn't catch this (assuming they even exist) is python-zstandard's current behavior with regard to stopping reading at the end of zstd frames. That's an active point of debate and change for this project (see the 0.19 release notes). Failure to catch malformed output [produced by this library] is another reason to read across frames by default.

More Avoidable flush()

I also noticed that nsz calls ZstdDecompressionReader.flush() in multiple locations. This method is a no-op and doesn't make sense on a reader. Those calls can safely be deleted.

What To Do Next

Our FLUSH_* and COMPRESSOBJ_FLUSH_* constants somewhat mirror the values from the zstd C library. But the types aren't identical. We could potentially change the value of the constants inside this Python package so values don't overlap. Then we could have stronger value checking to catch bad flush constant usage and prevent misuse bugs like what we saw here.

But changing the values of these constants would be a backwards incompatible change to this library. We can keep this issue open to consider this change.

nicoboss commented 1 year ago

Thank you so much for reviewing my code - never expected you to do so much for me. I fixed all the issues you found in this commit: https://github.com/nicoboss/nsz/commit/ec7fecc38c0eb931d55cf84a38844264f8d0387f.

Unfortunately this had no effect on the compatibility issue. v0.15.2 and lower is still compatible with all the 3rd party software while 0.16.0 is not. The files produced with and without those flushing fixes are even hash-identical.

However, I did some further testing and found something really interesting. There are two ways to compress games inside NSZ.

There is solid compression, which compresses the entire game as one solid compressed block (https://github.com/nicoboss/nsz/blob/master/nsz/SolidCompressor.py). It's what almost everyone uses and the thing that has the compatibility issue. There is also block compression, where NSZ splits the file into many small blocks and compresses them separately using a single compressed = ZstdCompressor(level=compressionLevel).compress(buffer) call for each block (https://github.com/nicoboss/nsz/blob/master/nsz/BlockCompressor.py).

It turns out that block compression has no compatibility issues. This is so strange. Maybe there is still something wrong with my SolidCompressor code, but everything looks fine to me and it also works perfectly fine in python-zstandard v0.15.2 and older. It's basically just:

cctx = ZstdCompressor(level=compressionLevel, threads=threads)
compressor = cctx.stream_writer(f)
compressor.write(buffer)
(...)
compressor.write(buffer)
compressor.flush(FLUSH_FRAME)

I also did some more research and found out that all incompatible third-party applications use libzstd v1.4.4 from the official zstd repository (https://github.com/facebook/zstd/releases/download/v1.4.4/zstd-1.4.4.tar.gz). This is because for homebrew development they use devkitPro, which comes bundled with that library. Is python-zstandard v0.16.0 supposed to be compatible with libzstd v1.4.4?

I also noticed that I can solid compress with 0.16.0 and decompress with 0.15.2, so as long as everything uses python-zstandard, compatibility is maintained. With the same arguments, 0.16.0 actually produces slightly smaller files. I will definitely continue to investigate this issue in the next few days and let you know when I find out anything new.

nicoboss commented 1 year ago

I did some more testing today and it turns out that even if compressed = ZstdCompressor(level=compressionLevel).compress(buffer) is used, the issue still occurs, as seen here: https://github.com/nicoboss/nsz/commit/726b896e8dc05f1ea1ecf1f71fdc8aa9e2b1987f. This is really strange, as block compression does almost the same thing, just with small chunks, and works fine.

I also did some testing to narrow down whether the issue was caused by a python-zstandard change or by the zstd 1.4.8 to 1.5.0 update. It turns out it's the zstd update: I installed python-zstandard v0.16.0 with the pyd files from python-zstandard v0.15.2, which worked. I also tried python-zstandard v0.15.2 with the pyd files from python-zstandard v0.16.0, which had the issue.

Sorry for wasting your time discussing an issue that probably isn't related to any python-zstandard code and is likely an issue with zstd compatibility itself. I highly appreciate all the work you do for python-zstandard and am using it for multiple projects. I will do some further testing in the next few days and let you know if I make any additional findings. Maybe I somehow find a workaround. If not, I will have to pin python-zstandard to v0.15.2 for the next NSZ release.

nicoboss commented 1 year ago

We finally managed to find and fix this compatibility issue. Zstandard v1.5.0 enables block splitting by default for compression levels 16+. NSZ uses compression level 18 by default. With block splitting enabled, zstd occasionally produces unaligned memory sizes during stream decompression. The encrypt function requires an aligned memory size for AES128 encryption to work. This issue affects almost all 3rd party software supporting the NSZ file format, as nobody expected zstd to suddenly start producing unaligned memory sizes, even though alignment was never guaranteed.

Quote from https://github.com/facebook/zstd/releases/tag/v1.5.0 regarding block splitting:

Improving compression ratio via block splitting is now enabled by default for high compression levels (16+). The amount of benefit varies depending on the workload. Compressing archives comprised of heavily differing files will see more improvement than compression of single files that don’t vary much entropically (like text files/enwik). At levels 16+, we observe no measurable regression to compression speed.
The block splitter can be forcibly enabled on lower compression levels as well with the advanced parameter ZSTD_c_splitBlocks. When forcibly enabled at lower levels, speed regressions can become more notable. Additionally, since more compressed blocks may be produced, decompression speed on these blobs may also see small regressions.

The fix was to modify the decompression buffer to force an exactly aligned size for encryption. The change that fixes the issue can be found here: https://github.com/dezem/AtmoXL-Titel-Installer/compare/master...nicoboss:AtmoXL-Titel-Installer:bugfix/nsz-unaligned-encryption

Is there a way to manually disable block splitting in python-zstandard? That way I could update to the latest python-zstandard version while still giving users the option to compress without block splitting and so being compatible with 3rd party software that hasn't rolled out this fix yet.

@indygreg Can you please change the title of this issue to something like "zstandard >= 0.16.0 has block splitting enabled by default for higher compression levels which has different decompression behavior". None of this was python-zstandard's fault. I still recommend a small note in the python-zstandard changelog. Anyway, I'm just so happy that this is finally resolved. Feel free to close this issue. I can't because @jacoghi created it.

indygreg commented 1 year ago

Thanks for the update.

Regarding block splitting, it looks like zstd.h defines the following:

/* ZSTD_c_useBlockSplitter
 * Controlled with ZSTD_paramSwitch_e enum.
 * Default is ZSTD_ps_auto.
 * Set to ZSTD_ps_disable to never use block splitter.
 * Set to ZSTD_ps_enable to always use block splitter.
 *
 * By default, in ZSTD_ps_auto, the library will decide at runtime whether to use
 * block splitting based on the compression parameters.
 */
#define ZSTD_c_useBlockSplitter ZSTD_c_experimentalParam13

On first glance this seems to be what you want. This package doesn't yet expose this setting. If you would like it to do so, please open a new issue requesting that feature. I can't make promises on when I'll have time to implement it.

Anyway, closing this issue upon your request.