internetarchive / openlibrary

One webpage for every book ever published!
https://openlibrary.org
GNU Affero General Public License v3.0

Add support for Webp/AVIF to coverstore #7250

Open cdrini opened 1 year ago

cdrini commented 1 year ago

These next-gen image formats offer much better compression ratios than traditional JPEG. Covers would not only be smaller on disk, but would also improve page load times on Open Library.

Describe the problem that you'd like solved

Proposal & Constraints

Additional context

Quoting @onnotasler:

As far as I can see, we are using JPEG as the image file format at the moment. Would it be worthwhile to consider a switch to AVIF? As far as I am aware, most modern browsers support the AVIF format nowadays, and it offers similar image quality at a smaller file size.

Here is an example of a workflow that improves the size-to-quality trade-off even further: https://www.tempertemper.net/blog/avif-and-webp-are-not-always-better-than-png-and-jpg
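For illustration, here is a minimal sketch of re-encoding one image as both JPEG and WebP with Pillow and comparing the byte counts. It is not coverstore code: the `encode` helper, the built-in gradient test image, and the WebP quality of 80 are all assumptions made for the example. AVIF is left out because whether Pillow can write it depends on the installed version or on a plugin.

```python
import io

from PIL import Image, features


def encode(img, fmt, **save_kwargs):
    """Encode a Pillow image in memory and return the raw bytes (hypothetical helper)."""
    buf = io.BytesIO()
    img.save(buf, format=fmt, **save_kwargs)
    return buf.getvalue()


# Stand-in for a cover: Pillow's built-in 256x256 gradient, converted to RGB.
img = Image.linear_gradient("L").convert("RGB")

# Quality 90 is the value the thread reports Open Library using for JPEG.
jpeg_bytes = encode(img, "JPEG", quality=90)
results = {"JPEG": len(jpeg_bytes)}

# WebP support is optional in Pillow builds, so check before encoding.
webp_bytes = None
if features.check("webp"):
    # quality=80 is an arbitrary choice for this sketch, not a recommendation.
    webp_bytes = encode(img, "WEBP", quality=80)
    results["WEBP"] = len(webp_bytes)

for fmt, size in results.items():
    print(f"{fmt}: {size} bytes")
```

A gradient is not representative of real covers, so any real comparison would need to run over a sample of actual cover images.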

Stakeholders

@onnotasler

davidscotson commented 1 year ago

I'm a little surprised that it's not better supported in standard tools, but potential improvements of roughly 20-25%, possibly without moving away from JPEG as a format, were written up back in 2017 in a Python context by Yelp here:

https://engineeringblog.yelp.com/2017/06/making-photos-smaller.html

Couple of key parts:

  • Changes to Pillow settings were responsible for about 4.5% file size savings
  • Dynamic Quality was responsible for about 4.5% file size savings
  • Switching to the mozjpeg encoder was responsible for about 13.8% file size savings

(Note: I edited the text slightly to make the savings clearer and skipped one bullet point that isn't relevant. Also note that Open Library currently uses the same Pillow code for image uploads and, I assume, imports.)

and

Support for more modern content types like WebP or JPEG2k is certainly on our radar. Even once that hypothetical project ships, there will be a long-tail of users requesting these now-optimized JPEG/PNG images which will continue to make this effort well worth it.

Currently Open Library uses quality 90, while Yelp started from a baseline of 85, which is still higher than what many social sites use. So even a simple static switch to 80 or 85 might save substantial storage and transfer size with no perceptible quality loss, even without any clever quality-level selection.
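To get a rough feel for the effect of a static quality change, the sketch below encodes one synthetic image at quality 90, 85 and 80 with Pillow and prints the sizes. Everything here is an assumption for illustration: the `make_test_image` and `jpeg_size` helpers are hypothetical, the image is a noisy gradient rather than a real cover, and `optimize=True` / `progressive=True` are real Pillow save options that may or may not match the exact settings Yelp describes.

```python
import io
import random

from PIL import Image


def make_test_image(width=300, height=450, seed=0):
    """Build a deterministic noisy gradient as a stand-in for a cover (hypothetical helper)."""
    rng = random.Random(seed)
    data = bytearray()
    for y in range(height):
        for x in range(width):
            noise = rng.randint(-20, 20)
            data.append(max(0, min(255, x * 255 // width + noise)))   # red
            data.append(max(0, min(255, y * 255 // height + noise)))  # green
            data.append(max(0, min(255, 128 + noise)))                # blue
    return Image.frombytes("RGB", (width, height), bytes(data))


def jpeg_size(img, quality, **save_kwargs):
    """Return the encoded JPEG size in bytes for the given quality setting."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality, **save_kwargs)
    return len(buf.getvalue())


img = make_test_image()

# Compare the current quality (90) against the two candidates from the thread.
sizes = {q: jpeg_size(img, q, optimize=True, progressive=True) for q in (90, 85, 80)}

baseline = sizes[90]
for q, size in sizes.items():
    saving = 100 * (baseline - size) / baseline
    print(f"quality={q}: {size} bytes ({saving:.1f}% smaller than quality 90)")
```

The percentages from a synthetic image say little about real covers; running the same loop over a sample of actual cover files would give a more meaningful estimate.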

Seems like Open Library and the Internet Archive must be serving a crazy number of JPEGs, so the above is probably worth looking into.