dakom closed this 3 years ago
this is a bit of a nightmare on the encoding side... took me the better part of a day trying to sort it all out. long story short:
most browsers play vp9/webm with alpha, but Apple only supports alpha with hevc.
here's an ffmpeg example command for the webm side:
ffmpeg -i example.gif -c vp9 -b:v 0 -crf 26 -pix_fmt yuva420p example.webm
the trick there is the pix_fmt (which actually may be auto-detected)...
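for reference, a minimal sketch of how a backend might assemble that same command in Rust. The names `webm_alpha_args` and `run_ffmpeg` are made up for illustration (this is not the actual media server code), and it assumes `ffmpeg` is on the PATH. The flags are exactly the ones from the command above, nothing more:

```rust
use std::io;
use std::process::{Command, ExitStatus};

/// Builds the argument list for the VP9/WebM command quoted above.
/// `input` and `output` are plain paths; nothing is validated here.
fn webm_alpha_args(input: &str, output: &str) -> Vec<String> {
    [
        "-i", input,
        "-c", "vp9",
        "-b:v", "0",
        "-crf", "26",
        // The alpha-capable pixel format is what keeps the transparency.
        "-pix_fmt", "yuva420p",
        output,
    ]
    .iter()
    .map(|s| s.to_string())
    .collect()
}

/// Hypothetical runner: spawns `ffmpeg` and waits for it to exit.
fn run_ffmpeg(args: &[String]) -> io::Result<ExitStatus> {
    Command::new("ffmpeg").args(args).status()
}
```

calling `run_ffmpeg(&webm_alpha_args("example.gif", "example.webm"))` would then be equivalent to running the command by hand.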
trying to set that with libx265 to produce the hevc doesn't work: libx265 simply doesn't currently support any alpha pixel format
it seems like there may be a supported ffmpeg encoder for Apple hardware, but I couldn't test it, and this won't help us since the media server isn't running on Apple hardware.
more info: https://stackoverflow.com/questions/61661140/convert-webm-to-hevc-with-alpha
I ended up doing the native gif parsing thing. Closing this, with some notes:
So ultimately, it would be nice to support transparent video stickers, but it isn't necessary or ideal to deprecate gif
I still need to clean things up and add loop controls etc., but for the sake of reference, here is working animation code (writes to a canvas context... once the dust settles it's just blitting ImageData and runs quickly): https://github.com/ji-devs/ji-cloud/tree/adc0712b2a4247a64c863092a967b2dfb1cad3cd/frontend/apps/crates/entry/module/legacy/play/src/base/design/sprite
Is your feature request related to a problem? Please describe.
The current planned animation kinds are Gif and Spritesheet. Neither of these is being used yet; they were intentionally added as placeholders.
As I am building the legacy player, I had to add gif animation support. When it's simply a matter of showing the gif, it's not a problem. But being able to pause/play means controlling the rendering, which ultimately means rendering it raw. This leads to a real performance problem with decoding. It can easily take 10 seconds to decode all of a gif's frames, even with an optimized wasm parsing library.
Of course, the cost is amortized by only decoding a frame at a time as-needed, but my rough benchmarks show this alone to cost a dozen milliseconds or so, which means it would break the user experience whenever updates need to happen for multiple animations at once. This depends both on the number of animations on the page as well as the frequency of their timing, which makes the whole approach hard to predict and inherently unscalable.
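To make the budget problem concrete, here is a rough back-of-the-envelope sketch. It assumes a 60 Hz display and takes 12 ms as the "dozen milliseconds or so" per decoded frame; both numbers and all function names are illustrative, not measurements from the player.

```rust
/// Time available per rendered frame, in milliseconds.
fn frame_budget_ms(refresh_hz: f64) -> f64 {
    1000.0 / refresh_hz
}

/// Main-thread decode time when `due` animations all need a new
/// frame in the same tick (decodes run one after another).
fn decode_cost_ms(due: u32, per_frame_ms: f64) -> f64 {
    f64::from(due) * per_frame_ms
}

/// True if the decode work alone still fits inside one frame.
fn fits_budget(due: u32, per_frame_ms: f64, refresh_hz: f64) -> bool {
    decode_cost_ms(due, per_frame_ms) <= frame_budget_ms(refresh_hz)
}
```

Under those assumptions one animation fits in the roughly 16.7 ms budget (`fits_budget(1, 12.0, 60.0)` is true), but two animations updating in the same tick already cost 24 ms and drop a frame.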
This only affects the Gif option. Although spritesheets haven't been spec'd yet, they tend to follow the same basic idea of one source image, either with individual cells representing each frame in the simple case, or cells representing parts that are composed together at runtime in the complex case. Either way, there is not the cost of decoding and writing pixel data.
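For the simple case, picking a frame is just arithmetic on the source image. A minimal sketch, assuming a uniform grid laid out row by row (the spritesheet format isn't spec'd yet, so the layout, `Rect` and `cell_rect` are all hypothetical):

```rust
/// Source rectangle of one cell inside the spritesheet image, in pixels.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rect {
    x: u32,
    y: u32,
    w: u32,
    h: u32,
}

/// Returns the cell for `frame` in a grid with `columns` cells per row.
/// Assumes `columns > 0` and that every cell has the same size.
fn cell_rect(frame: u32, columns: u32, cell_w: u32, cell_h: u32) -> Rect {
    Rect {
        x: (frame % columns) * cell_w, // column index times cell width
        y: (frame / columns) * cell_h, // row index times cell height
        w: cell_w,
        h: cell_h,
    }
}
```

The resulting rectangle would be used as the source rectangle of a canvas `drawImage` call, so advancing a frame changes four numbers rather than decoding and writing pixel data.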
As an aside, the gif format is also severely limited for artistic expression, due to the tiny palette size (at most 256 colors per frame).
Describe the solution you'd like
Gif should be completely deprecated. Instead, a new Video type should be added in its place.
When the user uploads a gif, the backend should use ffmpeg to convert it into video with transparency. This currently means dual formats for cross-browser support: VP9 and HEVC.
I believe this is the solution taken by large companies such as Facebook too
We already have the pipeline in place to support this, via the media transcoding, which is pretty awesome since building this out from scratch now would be painful (lots of moving parts- including the requirement to notify the frontend when it's ready via an event)
Although video is also expensive to decode, it benefits from the browser internals. In many cases it can use hardware acceleration or at least benefit from a separate decoding thread in the browser. Ultimately, it should perform much better than manually decoding and displaying gifs
Other nice benefits are smaller filesizes and we can support actual video stickers, not limited to the gif palette restriction
Describe alternatives you've considered
The decoding could happen in a separate thread. Technically this would solve the problem too, but native multithreading with Rust syntax (e.g. `std::thread`) isn't yet supported when targeting wasm, and web workers are annoying to deal with.
While this would inherently solve the problem of blocking the UI, it would require careful design in order to get the display timing right. My rough thought here would be to have 2 threads: one which the UI talks to in order to get the next frame, and the other which decodes all at once and sends to that thread as soon as each frame is ready.
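A rough sketch of that shape, written with `std::thread` and a channel so it runs on a native target (in the browser this would have to be web workers and message passing instead). Everything here is hypothetical: `Frame`, `spawn_decoder` and `FrameQueue` are made-up names, the decoder only pretends to decode, and the "thread the UI talks to" is simplified into a non-blocking queue polled by the caller.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Stand-in for a decoded frame; a real one would carry RGBA pixels.
#[derive(Debug, PartialEq)]
struct Frame {
    index: usize,
    rgba: Vec<u8>,
}

/// Decoder side: decodes everything in one go and sends each frame
/// as soon as it is ready. The actual decode is left out.
fn spawn_decoder(frame_count: usize) -> Receiver<Frame> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        for index in 0..frame_count {
            let frame = Frame { index, rgba: Vec::new() };
            if tx.send(frame).is_err() {
                break; // the receiving side went away, stop decoding
            }
        }
    });
    rx
}

/// UI side: collects frames as they arrive, never blocks.
struct FrameQueue {
    rx: Receiver<Frame>,
    ready: Vec<Frame>,
}

impl FrameQueue {
    fn new(rx: Receiver<Frame>) -> Self {
        FrameQueue { rx, ready: Vec::new() }
    }

    /// Returns the frame if it has been decoded already, else `None`.
    fn get(&mut self, index: usize) -> Option<&Frame> {
        // Drain whatever the decoder has produced so far.
        while let Ok(frame) = self.rx.try_recv() {
            self.ready.push(frame);
        }
        self.ready.get(index)
    }
}
```

The hard part mentioned above is what to do when `get` returns `None` at the moment a frame is due: hold the previous frame, skip ahead, or stall the animation clock. The sketch deliberately leaves that open.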
Ultimately this would turn into a hard problem, and isn't worth the effort imho
I also tried pre-decoding the gif into a raw binary blob and compressing it with zlib. The filesize was larger than the original gif, which is itself quite large.
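The starting size of such a blob is easy to estimate. A small sketch, assuming every frame is stored as a full RGBA image at 4 bytes per pixel (the function name and the example dimensions are illustrative, not the files I tested):

```rust
/// Size in bytes of a fully decoded animation stored as raw RGBA,
/// with every frame kept as a complete image.
fn raw_rgba_bytes(width: u64, height: u64, frames: u64) -> u64 {
    width * height * 4 * frames
}
```

For example, `raw_rgba_bytes(500, 500, 100)` is 100,000,000 bytes before compression. The gif stores at most one palette index byte per pixel and already LZW-compresses it, so zlib has a lot of ground to make up.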
Another idea might be to convert it to a simple spritesheet, but this is still much less efficient than video. In fact I would argue that most simple spritesheets should be video too, and the spritesheet format should be preserved for the "complex" kind (like Spine or DragonBones)
Additional context
This requires some coordination, so assigning a few different people to it. Also marking Future since it's not a current requirement.