Add spritesheet animation handling. Also fix several minor issues

Codas commented 2 months ago

This also addresses issues #266, #268, #269, #270, #271, #274 and #276 and can quite dramatically increase performance in cases where no filters or masks are applied to the effect.

Previously, each effect would essentially need 4 draw calls: one to paint the sprite texture, one for the always-present alpha filter, and two more for the always-present filters on the container that produce the faded and desaturated look when viewing effects intended only for other players.

By not creating these filters at all when they are not needed (as is already done for the masks), this draw call overhead can be eliminated. This not only removes the 3 unnecessary draw calls for the filters, but also allows the effects' texture drawing to be batched, with (on most systems) up to 16 effects in one draw call.
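A minimal sketch of the idea, not the code from this PR: filters are only created when an effect actually needs them, and the draw call count follows from that. `EffectOptions`, `FilterSpec`, `buildFilters` and `drawCalls` are hypothetical names, and the filter amounts are placeholder values.

```typescript
// Hypothetical types for illustration; these are not Sequencer's real classes.
type FilterSpec = { kind: "alpha" | "fade" | "desaturate"; amount: number };

interface EffectOptions {
  opacity: number;              // 1 means fully opaque
  onlyForOtherPlayers: boolean; // viewer sees a faded, desaturated preview
}

// Returns null when no filter is needed, so the sprite can be drawn
// without any per-effect filter pass and stays eligible for batching.
function buildFilters(opts: EffectOptions): FilterSpec[] | null {
  const filters: FilterSpec[] = [];
  if (opts.opacity < 1) {
    filters.push({ kind: "alpha", amount: opts.opacity });
  }
  if (opts.onlyForOtherPlayers) {
    filters.push({ kind: "fade", amount: 0.5 });       // placeholder amount
    filters.push({ kind: "desaturate", amount: 1 });   // placeholder amount
  }
  return filters.length > 0 ? filters : null;
}

// Simplified cost model from the description above:
// one call for the sprite plus one per filter that actually exists.
function drawCalls(opts: EffectOptions): number {
  const filters = buildFilters(opts);
  return 1 + (filters === null ? 0 : filters.length);
}
```

Under this simplified model, a plain effect costs one draw call, while an effect with reduced opacity that is shown as "other players only" costs the four calls described above.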

I've added a stresstest example to show the difference, though only the first part will be playable in the current version of sequencer.

A word on JSON spritesheets. I've included one free, CC0 animation as a spritesheet, flipbook and VP8 video file in this PR (the flipbook version is already available in the other examples PR) to make it easy to compare the performance of these different versions. Spritesheets have the simple but in some cases quite substantial advantage of not needing a separate video decode step, as all frames of the animation are encoded in one base texture. Furthermore, all effects playing the same animation share the exact same underlying base texture, which means that duplicates of this animation have practically no impact on performance at all. In the stresstest example, all 600 fire animations can be drawn in one single draw call using the spritesheet, which makes spritesheets ideal for persistent animations that are duplicated often, like effects on tokens or small fire effects that might be placed all over a map.
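To illustrate why duplicates are so cheap, here is a small sketch assuming a spritesheet is described by an ordered list of frame rectangles inside one shared texture. Every effect playing the animation only needs its own playback time; the frame list and the texture are shared. `FrameRect` and `frameAt` are hypothetical names, not part of this PR.

```typescript
// One frame's rectangle inside the shared base texture (hypothetical shape).
interface FrameRect {
  x: number;
  y: number;
  w: number;
  h: number;
}

// Picks the frame to show at `timeMs` for a looping animation.
// No decoding happens here: it is only an index into the shared frame list.
function frameAt(frames: FrameRect[], fps: number, timeMs: number): FrameRect {
  if (frames.length === 0) {
    throw new Error("spritesheet has no frames");
  }
  const raw = Math.floor((timeMs / 1000) * fps);
  // Wrap around for looping; the double modulo keeps negative times valid.
  const index = ((raw % frames.length) + frames.length) % frames.length;
  return frames[index]!;
}
```

Two effects that started at different times would simply call `frameAt` with different `timeMs` values against the same `frames` array.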

As for the general reasoning behind these somewhat extensive changes:

I've decided to move the asset/texture loading and the detailed sprite handling into a separate class, called SequencerSpriteManager. The main reason was that I could not get spritesheet unloading to work reliably in all cases, because sometimes the files would destroy the shared spritesheet texture while it was still in use by other effects. Instead of littering the destroy dependencies and the SequencerFile-internal asset handling code with special cases for spritesheets, I thought it best to handle asset loading and unloading in one place and one place only.

Another reason for the added class is that starting with PIXI v8 (which will be introduced in Foundry VTT 13), sprites can no longer have children. Currently, a sprite is sometimes used as a container and sometimes as a sprite displaying a texture itself. Again, changing this without also breaking some existing effects, especially those using animateProperty or loopProperty, turned out to be quite complex. A separate class that extends a simple container but imitates a sprite through its properties seemed the best way to handle this in a backwards-compatible manner. At least as backwards compatible as I could make it.

Of course, this had some far-reaching consequences for the CanvasEffect class itself, but it also allowed me to take a stab at fixing some of the existing issues with animation looping, as well as some strangeness when using text on top of effects and how it affects the sprite's anchor.
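One common way to avoid destroying a shared texture that is still in use is reference counting in a single owner. The sketch below shows that general technique only; it is not the actual SequencerSpriteManager, and `SharedAssetCache`, `acquire` and `release` are hypothetical names.

```typescript
// Hypothetical reference-counted cache: the only place that loads and destroys assets.
class SharedAssetCache<T> {
  private entries = new Map<string, { asset: T; refs: number }>();
  private load: (key: string) => T;
  private destroy: (asset: T) => void;

  constructor(load: (key: string) => T, destroy: (asset: T) => void) {
    this.load = load;
    this.destroy = destroy;
  }

  // Loads the asset on first use and counts every additional user.
  acquire(key: string): T {
    let entry = this.entries.get(key);
    if (entry === undefined) {
      entry = { asset: this.load(key), refs: 0 };
      this.entries.set(key, entry);
    }
    entry.refs += 1;
    return entry.asset;
  }

  // Destroys the asset only when the last user lets go of it.
  release(key: string): void {
    const entry = this.entries.get(key);
    if (entry === undefined) return;
    entry.refs -= 1;
    if (entry.refs <= 0) {
      this.entries.delete(key);
      this.destroy(entry.asset);
    }
  }
}
```

With this shape, an effect ending early can never pull the texture out from under another effect, because only the cache decides when `destroy` runs.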

I have done my best to verify that everything still works as before. All the examples / test cases added were used to compare the current sequencer version with this PR to make sure no new defects are introduced.

Codas commented 2 months ago

I'll have to review this a bit more over time (and when I have some more time after my move)

Yes, please take all the time you need. I'm also on vacation starting next week and will not be near a pc for two weeks. I just wanted to get this PR ready before then :)

I will also try and test this in some more real world and complex animations. Maybe I'll also ask vauxs if he has some crazy stuff lying around :)

Codas commented 2 months ago

I also finally managed to convert some of the jb2a assets to spritesheets to highlight the advantages of supporting spritesheets. This is about 500 sequencer effects of unique 400x400 animated assets running at once at 80fps on a MacBook (M1 Max).

https://github.com/user-attachments/assets/7fd37289-de8b-4331-bfc8-797f40948d6e

Haxxer commented 2 months ago

Woah, nice. Is that with the pre-multiply fix enabled or disabled? It's called the pixi fix in Sequencer's settings, it fixes the black edges around transparent effects

Codas commented 2 months ago

Huh, indeed.... The animations do look off when compared with the jb2a page. I never noticed that 🤦 Both do, because I thought my initial attempts at packing the sprite sheets looked way too bright compared to the jb2a assets, which led me to premultiply alpha values when packing the sprites -.- But it turns out the jb2a assets without the fix enabled are just too dark.

This is probably how it should look (sprite on the left, fix-enabled webm on the right)

(yes, basis encoded sprites look a bit worse, but that's also b/c of the quality settings I used and not really noticeable in actual play)

Thank you for bringing this to my attention! Now I'll just have to re-encode all those files again, without premultiplied alpha this time 😅

Codas commented 2 months ago

I still need to re-enable effect masking for this code. I'm on it :)

Haxxer commented 3 weeks ago

Will dd64ac0 be merged into #313?

Codas commented 3 weeks ago

No need, it's already included. I first prototyped the tiling and batching implementation in this branch and then, once I was somewhat happy, copied these changes over to the current sequencer master, tested them much more and fixed some issues, for example with scaling for tiling textures. dd64ac0 is simply those fixes re-applied to this branch.

I'm sorry for the confusion, this PR here (spritesheets) is still very much in a draft state. I'm currently working on streamlining many of the changes to make them less verbose and complicated. After that is done I plan to add some way to make this actually useful for the typical people using sequencer and not only in a hypothetical "what if everything was available as spritesheets" kind of situation.

Codas commented 2 weeks ago

So, this should now be in a much better state. I refactored the sprite-manager class to not have so many redundant classes and made some other improvements to reduce boilerplate. The last commits also introduced type annotations for the new classes to make the code a bit easier to follow.

Most importantly though, the last commit introduces Just In Time compilation of video files to spritesheets! Webm video files (vp8 and vp9 codecs) are supported for now. This is quite a bit of code and some blobs, so let me explain:

src/lib/insprecot-js contains a modified and updated inspector-js library. This library is used for "demuxing" the webm video files. Demuxing is the process of extracting the individual streams and frame data from a media container (webm in our case). This is needed to feed the individual frame data packets to the new(ish) WebCodecs web API, which enables native, hardware-accelerated decoding of video into individual frames!

src/lib/basis-encoder contains the wasm bundle of the Basis Universal encoder. This encoder supports encoding image data into a compressed texture format that can be uploaded directly to the GPU without decompression. The compression factor is fixed: the result is 1/4 the size of the plain RGBA texture[^1]. We limit spritesheets to 8k by 8k in size, which is a guaranteed texture size supported by WebGL2.

The one added library (potpack) is very tiny (<1 kb minified I think) and used to pack the individual frame data efficiently into the big spritesheet texture. potpack/index.js for reference

The process for spritesheet conversion currently is as follows:

All effects are initially displayed and rendered as normal with a video texture. If a persisted effect is encountered (one that has no "end after completion" flag), the spritesheet creation process is started. This uses a web worker pool for decoding, packing and compressing the textures. The pool size is (<number of logical cores> - 2) / 2[^2]. For each unique animation, only one job is passed to the workers, meaning hundreds of animations with the same file still create only one spritesheet.

After the spritesheet has been generated, the effect sprite is replaced with the newly created spritesheet, and settings such as speed and currentTime are re-applied as well as possible, but minor jumps in the animation are probably unavoidable in this process.

The result is: persisted animations are converted to spritesheets on a per-use basis and, once converted, are basically free to render. This is great for systems like pf2e, where many status effects might apply to multiple tokens at once. For example, basically every token being frightened inside a dragon's fearful presence aura.
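The "one job per unique animation" rule described above can be sketched as a registry keyed by file path. This is an illustration of the general pattern under assumed names (`SpritesheetJobs`, `request`, and the `compile` callback are hypothetical), not the worker code from this PR.

```typescript
// Hypothetical job registry: every request for the same file shares one pending job.
class SpritesheetJobs<T> {
  private jobs = new Map<string, Promise<T>>();
  private compile: (file: string) => Promise<T>;

  constructor(compile: (file: string) => Promise<T>) {
    this.compile = compile;
  }

  // Starts a compile job the first time a file is seen;
  // later requests get the same promise back.
  request(file: string): Promise<T> {
    let job = this.jobs.get(file);
    if (job === undefined) {
      job = this.compile(file);
      this.jobs.set(file, job);
    }
    return job;
  }
}
```

Each effect would await the promise returned by `request` and swap its video texture for the spritesheet once it resolves, so a hundred tokens with the same aura trigger a single conversion.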

Enough words, here's a video of this process in action: https://github.com/user-attachments/assets/de2e5ea8-464a-484d-a205-7e370dddf183

When the effects are first created, animation playback is very choppy, basically 1 frame per second. The canvas framerate also drops considerably because of texture pressure. After some time, the spritesheets are generated one by one and swapped in. Animations using a converted effect instantly run at 30fps instead. After all animations have been replaced, the canvas FPS climbs back to a stable 120fps! The animated color filter was just included to show off 😛

[^1]: GPU texture compression is basically required to support any reasonably sized spritesheets, since those can get quite large. An 8k by 8k image has a fixed memory size of 256MB in 8-bit RGBA format. With 4-5 of these textures, some systems might become memory constrained! This encoder lessens the issue by making even those textures "only" 64MB in size. Assuming at least 2GB of free memory for animation data, this means we can upload up to 32 max-size spritesheet textures instead of only 8!
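The numbers in this footnote can be checked with a few lines of arithmetic. The sketch assumes 8k means 8192 pixels and that MB and GB are meant as binary units (MiB and GiB); the function names are made up for this example.

```typescript
const BYTES_PER_MIB = 1024 * 1024;

// Uncompressed 8-bit RGBA uses 4 bytes per pixel.
function rgbaSizeMiB(width: number, height: number): number {
  return (width * height * 4) / BYTES_PER_MIB;
}

// Applies the fixed 1/4 ratio stated for the compressed texture format.
function compressedSizeMiB(width: number, height: number): number {
  return rgbaSizeMiB(width, height) / 4;
}

// How many textures of a given size fit into a memory budget.
function sheetsThatFit(budgetMiB: number, sheetMiB: number): number {
  return Math.floor(budgetMiB / sheetMiB);
}
```

For an 8192 by 8192 sheet this gives 256 MiB uncompressed and 64 MiB compressed, so a 2048 MiB budget holds 8 uncompressed or 32 compressed sheets.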

[^2]: The reasoning behind the worker pool size is that 2 CPU threads should probably always be 100% available for the browser, renderer and other processes. Of the remaining free threads, only half are actually used, so as not to overload the system and to make it very likely that "strong" CPU cores can be used even on systems with a split of powerful and energy-efficient cores. This includes basically every ARM system and modern Intel CPUs. In the case of hyperthreading, it means one worker can really run on one physical core, as hyperthreading does not seem to help much with this workload. This includes.. basically every other CPU out there, with few exceptions.
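As a small sketch of the pool size formula: the comment only gives (logical cores - 2) / 2, so rounding down and keeping at least one worker are assumptions made here, and `workerPoolSize` is a hypothetical name.

```typescript
// Pool size = (logical cores - 2) / 2.
// Assumptions not stated in the PR: the result is rounded down,
// and at least one worker is kept on machines with very few cores.
function workerPoolSize(logicalCores: number): number {
  const reservedForBrowser = 2;
  const usable = Math.floor((logicalCores - reservedForBrowser) / 2);
  return Math.max(1, usable);
}
```

In a browser, the number of logical cores would typically come from `navigator.hardwareConcurrency`; an 8-thread CPU would end up with 3 workers under these assumptions.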

LukeAbby commented 2 weeks ago

You mention:

This is needed to feed the individual frame data packets to the new(ish) WebCodecs web API, which enables native, hardware-accelerated decoding of video into individual frames!

Is there a fallback for stuff like VideoDecoder when it's not available? I think a non-negligible number of people run Foundry on non-https sites.

Codas commented 1 week ago

You mention:

Is there a fallback for stuff like VideoDecoder when it's not available? I think a non-negligible number of people run Foundry on non-https sites.

I don't think there is... There might be a way to do it in Chrome, but it is much slower and relies on a video element, so I don't even know if that would easily work in the current worker configuration. It would use the requestVideoFrameCallback hook and is susceptible to dropped frames, so whenever a frame is dropped we would have to let the video loop, maybe try jumping to the approximate location, and hope that the next run produces that frame. Not very elegant, fast or reliable. For Firefox there simply isn't anything anymore. There used to be seekToNextFrame, which was kinda reliable, but that got removed, probably because Firefox now has support for WebCodecs.

But you're right.. Most people using a local, self-hosted Foundry (just the app + IP sharing) will not provide a secure context. The GM, who essentially connects to localhost, will be treated as being in a secure context, but the players will not...

I should definitely add some checks so we don't even create the worker in that case, but I don't think we can really provide a good fallback solution.

Codas commented 1 week ago

I've added feature detection to not even download the sprite sheet worker in case a non-secure context is detected or VideoDecoder specifically is not available.
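A check along these lines could look like the sketch below. It is not the code from the commit: `RuntimeFeatures`, `canCompileSpritesheets` and `detectFromGlobals` are hypothetical names. `isSecureContext` and `VideoDecoder` are the standard browser globals.

```typescript
// The two globals the check relies on, typed loosely so the logic
// can be exercised outside a browser as well.
interface RuntimeFeatures {
  isSecureContext?: boolean;
  VideoDecoder?: unknown;
}

// True only when the page is a secure context AND a VideoDecoder
// constructor is exposed.
function canCompileSpritesheets(env: RuntimeFeatures): boolean {
  return env.isSecureContext === true && typeof env.VideoDecoder === "function";
}

// In a browser, the global object itself carries both properties.
function detectFromGlobals(): boolean {
  return canCompileSpritesheets(globalThis as unknown as RuntimeFeatures);
}
```

The worker script would then only be requested when `detectFromGlobals()` returns true, so players on a plain-http connection never download it.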

Since my last update I also added CacheStorage-based caching for the compiled sprite sheets. This relies on the same type of cache (file system level) that the browser uses for regular requests. It should be managed by the browser based on recent access patterns, time and available hard disk space. This gives a very nice boost for frequently accessed effects, as the texture compression step can be quite slow.

Haxxer commented 1 week ago

Won't the following lines still execute even after the create method has returned null, potentially throwing errors? (to be fair, I have not tested that!)

https://github.com/fantasycalendar/FoundryVTT-Sequencer/pull/275/commits/ad93b000e3d718c18cf0195398990c757ee62d8d#diff-f152739305fc0d3a79d9fe49be29db33dba514a5e8d06abbf5549893edcb7655R109

Codas commented 1 week ago

Won't the following lines still execute even after the create method has returned null, potentially throwing errors? (to be fair, I have not tested that!)

ad93b00#diff-f152739305fc0d3a79d9fe49be29db33dba514a5e8d06abbf5549893edcb7655R109

Damn it yes they would. Will fix :)

Codas commented 3 days ago

Turns out shapes weren't broken to begin with in this branch, I just failed to reload my foundry instance when testing (regarding my comment in issue #334)

fantasycalendar / FoundryVTT-Sequencer

Add spritesheet animation handling. Also fix several minor issues #275