codingjoe / django-pictures

Responsive cross-browser image library using modern codecs like AVIF & WebP
BSD 2-Clause "Simplified" License

WEBP thumbnails with text are blurred compared to the original PNG #104

Closed (karelklic closed 1 year ago)

karelklic commented 1 year ago

Hello,

thank you for an excellent package.

I noticed that the WEBP thumbnails are somehow more blurred than images resized previously by other means (e.g., manually in GIMP). This is visible with a lossless (PNG) image containing text.

Since WEBP also seems lossless, I suspect that the image.thumbnail(size) call in pictures/models.py causes the issue. The thumbnail method uses a bicubic resampling filter by default.

Does it make sense to take the resampling filter from the configuration?

image.thumbnail(size, resample=conf.get_settings().RESAMPLING_FILTER)

Or set the filter to Resampling.LANCZOS?

image.thumbnail(size, resample=Resampling.LANCZOS)

codingjoe commented 1 year ago

Hi @karelklic 👋

Thank you for reaching out and writing such an excellent ticket. I am curious about what kind of blur you are describing. I know that we had issues with state leakage once that caused low-res images to be upscaled. Do you happen to have a sample image?

Other than that, I am usually hesitant to add too many settings. I believe this package should deliver the best possible results without a PhD in image processing.

Therefore I'd probably try to change the result to the best possible default for web imagery.

I tried to keep the image processing extensible for advanced use cases. That bit might deserve some better documentation, though.

Ping me about the sample image, and I will help investigate further.

Cheers, Joe

karelklic commented 1 year ago

I investigated the issue further. The resampling filter is not causing the problem.

The root cause of blurriness is that WEBP produced by Pillow is lossy by default. Therefore lossless PNG images become lossy WEBP thumbnails.
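
The lossy default can be checked in memory with a short round trip. This is a minimal sketch, assuming Pillow was built with WebP support. The helper names (make_sample, webp_roundtrip, is_pixel_identical) are made up for illustration; only the lossless save option comes from Pillow's WebP writer.

```python
import io

from PIL import Image, ImageChops, ImageDraw


def make_sample():
    """Build a small synthetic image with sharp edges, similar to a diagram."""
    image = Image.new("RGB", (200, 80), "white")
    draw = ImageDraw.Draw(image)
    draw.rectangle((20, 20, 60, 50), outline="black")
    draw.line((0, 60, 199, 60), fill="red", width=1)
    return image


def webp_roundtrip(image, **save_kwargs):
    """Encode the image as WebP in memory and decode it again."""
    buffer = io.BytesIO()
    image.save(buffer, format="WEBP", **save_kwargs)
    buffer.seek(0)
    decoded = Image.open(buffer)
    decoded.load()  # Read the pixel data while the buffer is still available.
    return decoded.convert(image.mode)


def is_pixel_identical(first, second):
    """Return True when both images have exactly the same pixel values."""
    return ImageChops.difference(first, second).getbbox() is None


if __name__ == "__main__":
    sample = make_sample()
    print("default options:", is_pixel_identical(sample, webp_roundtrip(sample)))
    print("lossless=True:  ", is_pixel_identical(sample, webp_roundtrip(sample, lossless=True)))
```

With lossless=True the decoded pixels match the source exactly. With the default options they are not expected to match, because Pillow's WebP writer encodes lossily unless lossless=True is passed.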

It would be better if Django-Pictures saved lossless WEBP when resizing PNG images. When a user chooses PNG instead of JPG, it's often because the image contains diagrams and infographics with lots of text that doesn't look good with lossy compression. The thumbnails then don't look good when lossy-compressed either.

Here is a script that allows comparing different outputs from Pillow for reference:

#!/usr/bin/env python3
from PIL import Image

def create_thumbnail(image_path, thumbnail_path, thumbnail_size, resample, **kwargs):
    # Open the original image.
    image = Image.open(image_path)

    # Create the thumbnail in-place.
    image.thumbnail(thumbnail_size, resample=resample)

    # Save the thumbnail.
    image.save(thumbnail_path, **kwargs)

if __name__ == "__main__":
    size = (1000, 500)

    # Resampling filters to compare, keyed by the name used in the output file.
    filters = {
        "bilinear": Image.Resampling.BILINEAR,
        "bicubic": Image.Resampling.BICUBIC,
        "lanczos": Image.Resampling.LANCZOS,
        "box": Image.Resampling.BOX,
        "hamming": Image.Resampling.HAMMING,
    }

    # Output variants: file name suffix and the options passed to Image.save.
    variants = [
        ("_lossy.webp", {"format": "WEBP"}),
        ("_lossless.webp", {"format": "WEBP", "lossless": True}),
        (".png", {"format": "PNG"}),
    ]

    for suffix, save_options in variants:
        for name, resample in filters.items():
            create_thumbnail("Original.png", f"Thumbnail_{name}{suffix}", size, resample, **save_options)

(I'm sending the sample original image and two thumbnails to your email for easy comparison.)

The solution could be to extend SimplePicture.save to distinguish between lossy and lossless source formats. I think it could be as simple as:

def save(self, image):
    with io.BytesIO() as file_buffer:
        img = self.process(image)
        save_options = {}
        if image.format == "PNG":
            save_options = {"lossless": True}
        img.save(file_buffer, format=self.file_type, **save_options)
        ...

Pillow documentation for Image.save states that unrecognized options are silently ignored, so the lossless option can be provided regardless of self.file_type value.
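
The same idea can be tried outside of django-pictures. The following is only a sketch: save_thumbnail is a hypothetical helper, not part of the package, and it uses Image.thumbnail in place of whatever SimplePicture.process actually does. It assumes the source image was opened from a file or buffer, so that image.format is set.

```python
import io

from PIL import Image


def save_thumbnail(image, size, file_type):
    """Return the encoded thumbnail bytes (hypothetical helper).

    When the source is a PNG, pass lossless=True to the writer. Writers that
    do not know the option (for example PNG or JPEG) silently ignore it.
    """
    # Work on a copy so the caller's image keeps its size and format.
    thumbnail = image.copy()
    thumbnail.thumbnail(size)

    save_options = {}
    if image.format == "PNG":
        save_options["lossless"] = True

    with io.BytesIO() as file_buffer:
        thumbnail.save(file_buffer, format=file_type, **save_options)
        return file_buffer.getvalue()


if __name__ == "__main__":
    with Image.open("Original.png") as original:
        data = save_thumbnail(original, (1000, 500), "WEBP")
    with open("Thumbnail_lossless.webp", "wb") as output:
        output.write(data)
```

Note that the check uses the format of the original image, since a copy created by Pillow no longer carries the format attribute.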

codingjoe commented 1 year ago

Hi @karelklic,

Thank you for your detailed response. I appreciate the effort in including a code sample to reproduce the problem.

I am hesitant to override Pillow's default values. Those people usually know what they are doing, especially when it comes to tradeoffs between quality and size. Furthermore, I find the assumption that a user "knows" to use a PNG for lossless content questionable. Users have shown time and time again that Murphy's law remains solid.

However, giving you, the developer, the choice is important. Version 1.1 comes with a new setting that gives you the ability to swap the image processor. Beware, though, it's synchronous.

I am also not against exposing Pillow's Image.save **params via a setting. This setting could be per file type. We could allow FILE_TYPES to be a dictionary, like so:

# settings.py
PICTURES = {
    # …
    "FILE_TYPES": {
        # Docs: https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html
        "WEBP": {"lossless": True}
    },
}

That being said, I don't think it will solve your particular issue, since it won't allow you to handle cases differently based on the user input. In those cases, a custom processor seems more reasonable. We could also consider making the SimplePicture class swappable.
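
For illustration, a dictionary-style FILE_TYPES could be read in a backwards-compatible way roughly as follows. This is a sketch under assumptions: normalize_file_types is a hypothetical helper that does not exist in the package, and it assumes the setting is currently a plain list of format names.

```python
def normalize_file_types(file_types):
    """Map each file type to the keyword arguments for Image.save.

    Accepts either a list of format names (no extra save options) or a
    dictionary of format name to save options, as proposed above.
    """
    if isinstance(file_types, dict):
        return {name.upper(): dict(params or {}) for name, params in file_types.items()}
    return {name.upper(): {} for name in file_types}


if __name__ == "__main__":
    print(normalize_file_types(["WEBP"]))
    print(normalize_file_types({"WEBP": {"lossless": True}}))
```

The resulting per-type dictionary could then be passed as **params to Image.save. It still applies the same options to every upload of a given output type, regardless of the source format.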

Cheers! Joe