bluesky-social / atproto

Social networking technology created by Bluesky

AVIF and AV1 support #2908

Open Stryxus opened 4 weeks ago

Stryxus commented 4 weeks ago

Is your feature request related to a problem? Please describe.

Too many social media platforms use JPG and H.264, and heavily compressed JPG and H.264 at that, so why can we not have AVIF and AV1 support? Basically all browsers support them, except for AV1 on Safari on older hardware (WebP for compat, perhaps?).

I feel like, since Bluesky is still relatively new, this would be a great opportunity to onboard AVIF and AV1. It will make for a much better user experience. Yes, people don't like it when something isn't a JPG or PNG when they right-click and save the image, but it's a very small price to pay, and these days AVIF is very easily convertible anyway. AV1 is another story: right now, at least, it's somewhat hard to transcode into a more compatible format. Then again, that could deter people from downloading every video on the platform while it's still young.

On the storage and bandwidth side of Bluesky, this could potentially save the platform quite a lot of money.

Describe the solution you'd like

An implementation of AVIF. It should make the platform much faster and more efficient, as well as setting the stage for HDR. This would also be good if/when users can upload their own GIFs, since AVIF supports animations.

Describe alternatives you've considered

For AV1, the only issue is with Safari on 'unsupported' hardware, so VP9 as an alternative for now? This needs discussion.

Additional context

While abandoned now, this is a great comparison tool for AVIF/JPG.

YouTube uses AV1 for 8K video on supported hardware.

surfdude29 commented 4 weeks ago

I agree with the general direction of your suggestion, but I disagree on some of the specifics. Fwiw here are my own two cents on the question of next-generation image and video formats on Bluesky.

Although AVIF is a great image format, I think JPEG XL (aka JXL) would be a better option for the canonical versions of still images stored on the user's PDS. My understanding is that image processing is done on the user's device before it's uploaded to the PDS, so it needs to be very quick. Given my own casual experience with the two formats, conversion into JXL seems to be much quicker than AVIF, and you still get a decent file size saving vs. JPEG for the same quality.

While JXL support isn't nearly as common as AVIF today, that will only increase over time now that Apple has decided to throw their weight behind it, and hopefully Mozilla with Firefox will come on board soon too. The CDN could serve JXL versions to clients that support it (I don't know this for sure, but I imagine this could include iOS app users on iOS 17 or higher?) and easily generate JPEG versions for clients that don't yet, which is one of the major benefits of JXL.
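The "serve JXL to clients that support it, JPEG to the rest" idea can be sketched as content negotiation on the HTTP `Accept` header. This is only an illustration, not how the Bluesky CDN works: the function names and the `SERVER_PREFERENCE` order are hypothetical, and it assumes a client advertises a format by listing its media type explicitly (wildcards such as `*/*` are deliberately not treated as JXL or AVIF support).

```python
def parse_accept(header: str) -> dict[str, float]:
    """Parse an HTTP Accept header into {media_type: q_value}."""
    prefs: dict[str, float] = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when not given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs[media_type] = q
    return prefs


# Hypothetical server-side order: newest format first, JPEG as the fallback.
SERVER_PREFERENCE = ["image/jxl", "image/avif", "image/webp", "image/jpeg"]


def pick_image_format(accept_header: str) -> str:
    """Return the first server-preferred type the client explicitly accepts."""
    prefs = parse_accept(accept_header)
    for media_type in SERVER_PREFERENCE:
        if prefs.get(media_type, 0.0) > 0.0:
            return media_type
    return "image/jpeg"  # universal fallback


print(pick_image_format("image/jxl,image/avif,*/*;q=0.8"))   # image/jxl
print(pick_image_format("image/avif,image/webp,*/*;q=0.8"))  # image/avif
print(pick_image_format("*/*"))                              # image/jpeg
```

A cache in front of such an endpoint would also need `Vary: Accept` on the response, so that a JXL body is not handed to a client that only asked for JPEG.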

AVIF could have a role to play on Bluesky for showing animated images. It can offer huge reductions in file size vs. GIFs and it already has wide support, as you say.

For video on the web, AV1 is undoubtedly the future, but unfortunately I just don't think it's a viable option on Bluesky yet. My understanding is that AV1 encoding is still very compute-intensive and thus takes significantly longer than H.264.

And for display on the end user's device, AV1 hardware decoding – vital for performance and battery life – has only really become widely available in the last couple of years. E.g. on the Apple side of things only the iPhone 15 Pro, iPhone 16 models and Macs with an M3 processor or later have support.

So to me AV1 would look to be attractive to implement on Bluesky in a year or two hopefully, when more users are enjoying Bluesky on devices with AV1 hardware decoding, and when AV1 encoding efficiency has improved enough to close the gap with H.264.
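To make the hardware-decoding point concrete, here is a minimal sketch of how a server might choose between AV1, VP9 and H.264 renditions per client. Everything in it is an assumption for illustration: `ClientCaps`, `CODEC_PREFERENCE` and `pick_video_codec` are hypothetical names, and how a client would actually report its decode capabilities is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientCaps:
    """Hypothetical description of what a client can decode."""
    hardware_decode: frozenset[str]
    software_decode: frozenset[str]


# Assumed preference order, most efficient codec first.
CODEC_PREFERENCE = ("av1", "vp9", "h264")


def pick_video_codec(caps: ClientCaps) -> str:
    """Prefer any hardware-decodable codec over software-only decoding."""
    for codec in CODEC_PREFERENCE:
        if codec in caps.hardware_decode:
            return codec
    for codec in CODEC_PREFERENCE:
        if codec in caps.software_decode:
            return codec
    return "h264"  # assumed baseline that every client can play


newer_phone = ClientCaps(frozenset({"av1", "h264"}), frozenset())
older_phone = ClientCaps(frozenset({"h264"}), frozenset({"av1", "vp9"}))

print(pick_video_codec(newer_phone))  # av1
print(pick_video_codec(older_phone))  # h264
```

The second case is the design choice that matters here: a device that can only decode AV1 in software still gets H.264, because battery life and performance are the concern raised above.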

Stryxus commented 4 weeks ago

> Although AVIF is a great image format, I think JPEG XL (aka JXL) would be a better option for the canonical versions of still images stored on the user's PDS. My understanding is that image processing is done on the user's device before it's uploaded to the PDS, so it needs to be very quick. Given my own casual experience with the two formats, conversion into JXL seems to be much quicker than AVIF, and you still get a decent file size saving vs. JPEG for the same quality.

JXL could indeed work. I didn't mention it because the last I saw, around the beginning of the year, was that Chromium removed it very suddenly over some sort of dispute, and a community outcry began. Since then I've personally been invested in AVIF, so 🤷.

> For video on the web, AV1 is undoubtedly the future, but unfortunately I just don't think it's a viable option on Bluesky yet. My understanding is that AV1 encoding is still very compute-intensive and thus takes significantly longer than H.264.
>
> And for display on the end user's device, AV1 hardware decoding – vital for performance and battery life – has only really become widely available in the last couple of years. E.g. on the Apple side of things only the iPhone 15 Pro, iPhone 16 models and Macs with an M3 processor or later have support.
>
> So to me AV1 would look to be attractive to implement on Bluesky in a year or two hopefully, when more users are enjoying Bluesky on devices with AV1 hardware decoding, and when AV1 encoding efficiency has improved enough to close the gap with H.264.

This is a good point, I wasn't 100% sure whether the transcoding was done on the client or the server. Yeah, only Apple M3 and up as well as NVIDIA's RTX 4000 series support AV1 encoding. There seem to be a lot of libraries for encoding AV1 too, though. I've tested NVIDIA's NVENC AV1 on an RTX 4070 Super and it looks to be about half as efficient as AOM or SVT but MUCH faster, so it might still be a little way off, but it's good to keep up with it given the benefits.

bleonard252 commented 2 weeks ago

AVIF seems better suited for GIF-like images. I think JPEG-XL would be optimal for still images, although it seems to have almost no support on the web.

Tamschi commented 2 weeks ago

It depends on the content. JPEG-XL is better for (perceptually) lossless compression of artistic content and pixel animations. AVIF is likely to outperform it for lossy compression and reaction GIFs though (since it's basically a video compression scheme).

Afaict, what it comes down to is that JXL has better high-quality compression but no motion vector gradients.


A note on JXL->JPEG: For that to be efficient, you have to use compatibility mode which hurts JXL compression (and possibly accuracy?). It may be a good idea to use full-featured JXL for anything ATProto and start with the compatibility mode only in the CDN (since that does a lossy recompression anyway).

jonnyawsom3 commented 1 week ago

Just stumbled across this so I'll throw my two frames into the topic.

AVIF can currently just be a repackaged AV1 video, and even the JXL devs agree that videos should stay as video and not get twisted into an image format like GIF or JXL if possible. However, there are a lot of GIFs out there, usually optimized for the format's 8-bit palette using dithering and transparency. Video compression tends to handle these poorly, and only lossless compression can avoid serious color degradation and size increase. Not to mention pixel art that gets scaled up 8x or more, where lossless is the only option and JXL happens to do extremely well.

As for JXL to JPEG, that currently hasn't been implemented in libjxl, so the suggested method is to encode with jpegli (based on JPEG XL, with 35% smaller sizes than normal JPEG) and then use JXL transcoding for a further 20% file size decrease, with on-demand transcoding back to the original JPEG. This gives higher quality than normal JPEGs, both because jpegli is more advanced (even up to 10-bit in a standard 8-bit file) and because the transcoded JXL uses floats for more accurate decoding. (A lot of people have issues with the pixel hash not matching, because they don't realise it's actually closer to the original image than the JPEG.)

On my Ryzen 1700, transcoding a JPEG from Bluesky takes 160 ms and transcoding back to JPEG takes 50 ms. Transcoding a 20 MP camera image takes 500 ms and transcoding back to JPEG takes 160 ms, so it scales well, with a constant 20%+ saving.
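As a quick check on the figures in the comment above, this sketch works out the combined saving and the implied throughput. It assumes the two savings compound, i.e. the 20% is taken off the already-smaller jpegli file; the comment does not say so explicitly. The function name is hypothetical, and the inputs are the commenter's numbers, not new measurements.

```python
def combined_saving(*stage_savings: float) -> float:
    """Total fractional saving when each stage shrinks the previous output."""
    remaining = 1.0
    for saving in stage_savings:
        remaining *= 1.0 - saving
    return 1.0 - remaining


# jpegli (35% smaller than a normal JPEG), then JXL transcoding (a further 20%).
total = combined_saving(0.35, 0.20)
print(f"combined saving vs. a normal JPEG: {total:.0%}")  # 48%

# Throughput implied by the 20 MP timings quoted for the Ryzen 1700.
megapixels = 20
to_jxl_seconds = 0.500
to_jpeg_seconds = 0.160
print(f"JPEG -> JXL: {megapixels / to_jxl_seconds:.0f} MP/s")   # 40 MP/s
print(f"JXL -> JPEG: {megapixels / to_jpeg_seconds:.0f} MP/s")  # 125 MP/s
```

Under that compounding assumption, the stored JXL ends up at roughly half the size of a conventionally encoded JPEG.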

And that's without mentioning the possibility of using JXL's progressive nature to store a single file for multiple sizes, simply using partial requests to save data and sending the rest when needed. We've had images decode at just 0.14% loaded. Plus JXL could always be added to the apps if Chrome still lags behind (Firefox are working with the devs on a Rust version), but that's for the long term.
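The "partial requests" idea maps onto HTTP `Range` requests: a client asks for only the first part of a progressive file and fetches the rest later if it needs the full-size image. The helper below is a hypothetical sketch that only builds the header value. It assumes the client already knows the total file size and which fraction it wants; the byte offsets at which a real progressive JXL becomes decodable depend on how the file was encoded.

```python
import math


def range_header_for_fraction(total_bytes: int, fraction: float) -> str:
    """Build a Range header value covering the first `fraction` of a file."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    wanted = max(1, math.ceil(total_bytes * fraction))
    # Byte ranges are zero-based and inclusive at both ends.
    return f"bytes=0-{wanted - 1}"


# First quarter of a 1000-byte file for a thumbnail-sized preview.
print(range_header_for_fraction(1000, 0.25))  # bytes=0-249
```

A server that honours the header answers with `206 Partial Content`, and the remainder can be requested later with an open-ended range such as `bytes=250-`.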

Anyway, back on topic... Videos! (My 1700 cries at 1080p AV1)