LazyDuchess closed 10 months ago
hah, just today I was wondering whether Reia decoding could be done in a compute shader for performance.
had it in the back of my mind for a while :p
I don't think the format can be fully decoded in parallel: the blocks are all different sizes, so there's no quick way to get from an invocation ID to the block it should decode. You could decode a bunch of frames at the same time, but that's not very useful for a stream. The block size is also rather large, so small videos wouldn't even fill a single warp.
You could preprocess the frame to find block locations (single thread, so should be on CPU), but then you're doing a lot of the work twice.
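To make the preprocessing idea concrete, here is a rough sketch in Python. It does not use the real Reia layout: it assumes a made-up format where every block starts with a 2-byte little-endian payload length, and `index_blocks`, `decode_block` and `make_frame` are hypothetical names. The serial pass builds an offset table, after which an invocation ID is just an index into that table. If block sizes were only known after decoding a block, the indexing pass would repeat much of the decode, which is the "work twice" problem.

```python
import struct

def make_frame(payloads):
    """Build a frame in the ASSUMED layout: [u16 length][payload] per block."""
    return b"".join(struct.pack("<H", len(p)) + p for p in payloads)

def index_blocks(frame):
    """Serial pass (CPU side): return (payload_offset, payload_length) per block."""
    blocks = []
    pos = 0
    while pos < len(frame):
        if pos + 2 > len(frame):
            raise ValueError("truncated block header")
        (length,) = struct.unpack_from("<H", frame, pos)
        start = pos + 2
        if start + length > len(frame):
            raise ValueError("truncated block payload")
        blocks.append((start, length))
        pos = start + length
    return blocks

def decode_block(frame, offset, length):
    """Placeholder decode: returns the raw payload instead of real pixel data."""
    return frame[offset:offset + length]

def decode_frame(frame):
    """Each loop iteration stands in for one independent shader invocation."""
    table = index_blocks(frame)
    return [decode_block(frame, off, length) for off, length in table]
```

With the table in hand, invocation `i` only needs `table[i]`, so the per-block work no longer depends on walking the blocks before it.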
Yeah, I was thinking of preprocessing
Right now, the Reia player is really bad at playing bigger videos, like the 512x512 loading screens. Probably the constant Texture2D allocs are to blame. Performance is very poor when streamed, and they take a ton of memory if not streamed on top of taking forever to load.
Converting these to a more common format like MP4 on first launch could work, as could pooling textures or otherwise reducing how often they are created, but I think implementing Reia playback as a compute shader could be a better solution.
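For comparison, the pooling option could look roughly like the sketch below. It is Python rather than the player's actual code, a `bytearray` stands in for a `Texture2D`, and `BufferPool` is a hypothetical name, so treat it as an illustration of the idea only: frames borrow a buffer and hand it back, instead of allocating a new one each time.

```python
class BufferPool:
    """Reuses fixed-size frame buffers instead of allocating one per frame."""

    def __init__(self, width, height, bytes_per_pixel=4):
        # Assumes every frame of a video has the same dimensions and format.
        self.size = width * height * bytes_per_pixel
        self._free = []
        self.allocations = 0  # counts real allocations, for inspection

    def acquire(self):
        """Return a free buffer, allocating only when none is available."""
        if self._free:
            return self._free.pop()
        self.allocations += 1
        return bytearray(self.size)

    def release(self, buf):
        """Hand a buffer back so a later frame can reuse it."""
        if len(buf) != self.size:
            raise ValueError("buffer does not belong to this pool")
        self._free.append(buf)


def play(pool, frame_count):
    """Simulated playback loop: one buffer is borrowed per frame, then returned."""
    for _ in range(frame_count):
        buf = pool.acquire()
        # ... decode the frame into buf and display it here ...
        pool.release(buf)
```

In this sketch, playing any number of frames one at a time through `play` triggers a single allocation, since the same buffer is handed out again for every frame.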