Isotr0py / pillow-jpegxl-plugin

Pillow plugin for JPEG-XL, using Rust for bindings.
GNU General Public License v3.0

Regression in version 1.2.6 (corrupt data) #57

Closed by NetroScript 3 months ago

NetroScript commented 3 months ago

Version 1.2.6 seems to have a regression that causes invalid or incomplete data to be saved.

I am using the following Python code with version 1.2.6 (prebuilt manylinux wheel) to convert a large number of images to JXL. The same code worked on 1.2.4:

            # Try to load the image with PIL
            try:
                image = Image.open(image_data)
            except Exception:
                raise Exception("Unsupported image format")

            # The maximum side length of the image should be 4096 pixels, if it is larger, downscale it
            if image.width > 4096 or image.height > 4096:
                image.thumbnail((4096, 4096), Image.Resampling.LANCZOS)

            # Create a name for the stored image
            uuid_name = uuid.uuid4()

            # Save the image to an in-memory buffer
            image_memory = BytesIO()
            image.save(image_memory, format="JXL", quality=95, decoding_speed=2)

            # Save the image to the storage folder as JXL 
            with open(f"{storage_folder}/{uuid_name}.jxl", "wb") as f:
                f.write(image_memory.getvalue())

            <...> # more processing which reuses the memory buffer

About 30% of the resulting image files, seemingly at random, are broken or invalid. Opening one in IrfanView on Windows shows almost the entire image, except for roughly 10 pixel rows at the bottom. IrfanView then freezes until it is killed.

Trying to open the image with sharp (a Node.js image-processing library that uses libvips internally) produces the following error:

./lib/jxl/modular/transform/transform.h:86: JXL_FAILURE: Invalid transform ID
./lib/jxl/fields.cc:619: JXL_RETURN_IF_ERROR code=1: visitor.Visit(fields)
./lib/jxl/modular/encoding/encoding.cc:458: JXL_RETURN_IF_ERROR code=1: status
./lib/jxl/modular/encoding/encoding.cc:597: JXL_RETURN_IF_ERROR code=1: dec_status
./lib/jxl/dec_modular.cc:413: JXL_FAILURE: Failed to decode modular DC group
./lib/jxl/dec_frame.cc:308: JXL_RETURN_IF_ERROR code=1: modular_frame_decoder_.DecodeVarDCTDC(dc_group_id, br, dec_state_)
./lib/jxl/dec_frame.cc:644: JXL_FAILURE: Error in DC group
./lib/jxl/decode.cc:1182: frame processing failed

I tested 1.2.5 and 1.2.4, both of which work as expected.

It is really strange that this happens with version 1.2.6, though, because according to the changelog nothing much changed in that version. Was something perhaps different in the build process of the wheel?

Isotr0py commented 3 months ago

After #54, use_container and use_original_profile default to False. You can try adding use_container=True and use_original_profile=True to see whether it then behaves like version 1.2.5.

If the issue still exists, something may have gone wrong because of an environment change in the Linux build CI. (Unlike the Windows and macOS CI, we use a manually built libjxl for manylinux.)

BTW, can you provide a problematic image so that I can reproduce the issue and figure it out? Thanks!
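A minimal sketch of that suggestion, for anyone who wants to try it. The two flag names come from the comment above; the helper names (`legacy_save_kwargs`, `save_with_legacy_defaults`, `jxl_kind`) are hypothetical and not part of the plugin. The second helper only inspects the leading bytes of the output to tell a bare JPEG XL codestream from the boxed container form, which is a quick way to confirm that `use_container` actually took effect.

```python
# Hypothetical helpers, not part of pillow-jpegxl-plugin.
# Assumption: the image passed in is a Pillow image and the plugin has
# already been imported elsewhere so that format="JXL" is registered.

# Leading bytes of the two JPEG XL file forms.
JXL_CODESTREAM_MAGIC = b"\xff\x0a"
JXL_CONTAINER_MAGIC = b"\x00\x00\x00\x0cJXL \r\n\x87\n"


def legacy_save_kwargs(**overrides):
    """Build save() keyword arguments with both flags switched back on."""
    kwargs = {"format": "JXL", "use_container": True, "use_original_profile": True}
    kwargs.update(overrides)
    return kwargs


def save_with_legacy_defaults(image, fp, **overrides):
    """Call image.save() with the flags suggested in the comment above."""
    image.save(fp, **legacy_save_kwargs(**overrides))


def jxl_kind(data):
    """Classify encoded bytes as 'container', 'codestream', or None."""
    if data.startswith(JXL_CONTAINER_MAGIC):
        return "container"
    if data.startswith(JXL_CODESTREAM_MAGIC):
        return "codestream"
    return None
```

With the snippet from the original report this would be called as `save_with_legacy_defaults(image, image_memory, quality=95, decoding_speed=2)`, followed by `jxl_kind(image_memory.getvalue())` to see which form was written.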

NetroScript commented 3 months ago

OK, I did some testing now and was able to pinpoint the problem.

As so often, it is user error 😅 and it is completely unrelated to this repository.

I tried to reproduce it on my own machine (Arch-based) and was just not able to.

The code, however, runs on a Debian (bookworm) server, and there I could consistently reproduce the error, also with pyvips. Opening and re-encoding the files with this plugin worked fine, as did other programs. (IrfanView turned out to be hanging because of the SMB share: it tries to load every file in the directory, and the storage folder holds ~1 million files. That is also why there is a web viewer that uses sharp to view them in a browser.)

So libvips was the problem. Because the Debian package is too outdated for sharp, I had compiled the most recent version myself. That build, however, linked against the libjxl available on bookworm, which is 0.7. On the libjxl GitHub I saw that releases 0.7.1, 0.8.3 and 0.10.3 all fix a certain decoding regression, and that was my issue.

So the problem on my end was fixed by compiling libjxl myself and then recompiling libvips and sharp.
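For anyone checking their own setup, here is a rough sketch of a version check based only on the three releases named above. The function names and the lookup table are hypothetical, and the check deliberately answers "unknown" (`None`) for any release series the thread does not mention, since nothing here says which other versions carry the fix. How you obtain the libjxl version string (package manager, build logs) is left to you.

```python
# Hypothetical helper: encodes only the releases named in this thread.
# Maps (major, minor) series to the first patch release said to carry the fix.
FIXED_IN = {(0, 7): 1, (0, 8): 3, (0, 10): 3}


def parse_version(text):
    """Parse '0.7', '0.7.1' or 'v0.10.3' into a (major, minor, patch) tuple."""
    parts = [int(p) for p in text.strip().lstrip("v").split(".")[:3]]
    parts += [0] * (3 - len(parts))  # treat a missing component as 0
    return tuple(parts)


def has_decoding_fix(version_text):
    """Return True/False for the series named in the thread, else None."""
    major, minor, patch = parse_version(version_text)
    fixed_patch = FIXED_IN.get((major, minor))
    if fixed_patch is None:
        return None  # series not mentioned in the thread: unknown
    return patch >= fixed_patch
```

For the bookworm package described above, `has_decoding_fix("0.7")` reports that the fix is missing, while `has_decoding_fix("0.10.3")` reports that it is present.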