matrix-org / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://matrix-org.github.io/synapse
Apache License 2.0
11.82k stars 2.13k forks source link

We should create JPG thumbnails of PNGs, not PNG thumbnails #2045

Open ara4n opened 7 years ago

ara4n commented 7 years ago

Apparently if you upload a PNG it creates a PNG thumbnail. Surely thumbnails should always be lossy JPEGs

APwhitehat commented 7 years ago

I was looking into this issue and I found that I'm not not able to upload png files Synapse (develop branch) outputs this error IOError: cannot identify image file

Full log http://pastebin.com/n2CRXs1p

EDIT : This was a problem with PIL and not Synapse because I couldn't open that png file using PIL

APwhitehat commented 7 years ago

@ara4n Do we really want to have only jpeg thumbnails ? Because in Synapse code there is a variable named output_type being passed around . If we just want "JPEG" thumbnails why have the output_type , we can just implicitly assume it will be "JPEG" . as for forcing "JPEG" editing https://github.com/matrix-org/synapse/blob/develop/synapse/rest/media/v1/thumbnailer.py#L89 as # Force thumbnails to be JPEG output_image.save(output_bytes_io, "JPEG", quality=80) is enough

iav commented 7 years ago

Surely thumbnails should always be lossy JPEGs

But why?

ara4n commented 7 years ago

'cos thumbnails are already inherently lossy, so there's no point in using a typically lossless format like PNG given they are already imperfect. it just wastes bandwidth.

iav commented 7 years ago

Also for black-and-white lineart icons, and initialy tiny number of colors paletted png or gifs? Wrong. jpg for such files can be 10 KB for bad dirty jpg against 700 B png

turt2live commented 7 years ago

No solution is going to solve all problems. I'd imagine there's a higher ratio of images that produce nicer jpg thumbnails than png thumbnails.

Although there's nothing that I can see stopping the use of multiple thumbnail formats, but the determination of which container to use gets fairly arbitrary.

ara4n commented 7 years ago

it's true that for lineart PNGs can be smaller. But i suspect the majority of thumbnails are not for lineart (and i'm not sure we're going to get stuck into trying to guess if an image is lineart in order to pick the right compression algorithm! in other news, it sucks that nobody's written a ubiquitous image format that handles both nicely, yet)

iav commented 7 years ago

Then in most cases post author of source picture already did optimal choice – jpg as default and png for special cases?

iav commented 7 years ago

And what if picture in svg?

ara4n commented 7 years ago

the point is that even if people made the optimal choice with the source imagery, it will still be lossy when rescaled to a thumbnail, at which point (other than perhaps for lineart) using a lossless format is a waste of time. for instance, most OSes store screencaps as lossless PNGs, causing them to be huge if they have any photographic content. So thumbnailing these as PNGs is a waste of space.

I suspect the right solution here is to get the client to gen the thumbnail itself, and the sender can tune as they choose. This is how e2e thumbs and svg thumbs would work if they existed, as the server cannot really generate the thumbs due to lack of data or lacknof security (svg). This is already implemented in Riot; just needs to be hooked up. It does pose the trust issue that clients can lie about thumbnail contents, but this is inevitable with e2e.

We can't use svgs for svg thumbs as if I send you a 10MB SVG you certainly don't want a 10MB thumbnail; a crappy 100KB 640x480 jpeg should be fine.

clokep commented 4 years ago

This came up again (somewhat) in #7586. Note that currently jpeg and webp become jpeg thumbnails and png and gifs become png thumbnails, see:

https://github.com/matrix-org/synapse/blob/789606577ade2335f19e944efcfecfe808519b36/synapse/config/repository.py#L72-L75

ShadowJonathan commented 3 years ago

(Note that GIF thumbnails are being considered in #1278)

bkil commented 3 years ago

Note that if you use a drawing-aware resizing algorithm for lossless sources having a small palette, it will not result in a blurred image, hence a lossless output having a small palette could still be worthwhile. Related issue: https://github.com/matrix-org/synapse/issues/2667

JuniorJPDJ commented 3 years ago

Why not use webp instead of jpeg? It provides better quality in the same size or lower size for the same quality. I'm not 100% sure, but it may also solve part of issues with pngs being smaller than jpegs as webp seem to support color palette.

ShadowJonathan commented 3 years ago

webp might not be supported on all platforms

bkil commented 3 years ago

Nowadays WebP is considered pretty common and offering AVIF may also make sense:

It could be easily polyfilled client side (or if we wanted to support browsers without JavaScript, the server could produce both .webp and .jpeg and serve them as appropriate). You could save a massive amount of bandwidth if >90% of your audience would consume media that consumed 50% less.

Also, as webp has much less header overhead than jpeg, it would actually be feasible to replace blurhash with a small inlined webp as a thumbnail (in whatever efficient encoding)

ptman commented 3 years ago

For web platforms: https://github.com/tomaszs/webp-to-jpg. For others, use native code.

DMRobertson commented 2 years ago

Related:

ptman commented 2 years ago

This seems like it would be simple to change: https://github.com/matrix-org/synapse/blob/6f80fe1e1bbb6cab3ce605b2023e0488e2d80d52/synapse/config/repository.py#L47 . Maybe someone wants to try it out?