w3c / webcodecs

WebCodecs is a flexible web API for encoding and decoding audio and video.
https://w3c.github.io/webcodecs/

Figuring out Decoder VideoFrame pool size #612

Closed RobertVillalba closed 1 year ago

RobertVillalba commented 1 year ago

Hello, I've been using the decoder and have noticed that decoding hangs if the previously decoded frames are not closed. I've looked at the various caching methods that have been posted here, but they all seem to have their own limitations, so I'd like to try to work within the bounds of the VideoFrames that I'm allowed to hold by default. I've seen this value capped anywhere between 8 and 13 frames, and it does not seem to be correlated with frame size. Is there a simple way to tell how many frames I have available to use as a small cache, or is the best approach simply to keep feeding the decoder until progress stops? The latter option seems a bit crude, and perhaps there is a better way. Thanks in advance!

dalecurtis commented 1 year ago

No, this unfortunately isn't something that can be known ahead of time and will vary (effectively) randomly. In an ideal scenario you'll decode only the frames you need at that exact moment in time and use the 'ondequeue' events to detect when space opens up in the decoding queue.
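As a rough illustration of the suggestion above, here is a minimal sketch of feeding a decoder only while its input queue is short, and resuming when a dequeue event fires. `DecodeFeeder`, `QueueingDecoder`, and `maxQueued` are hypothetical names made up for this sketch; only `decodeQueueSize`, `decode()`, and the `dequeue` event correspond to the WebCodecs `VideoDecoder` surface. Note that this only limits queued input. It does not reveal how many output VideoFrames can be held, so frames still need to be closed promptly.

```typescript
// Hypothetical structural type: the subset of VideoDecoder this sketch relies on.
interface QueueingDecoder<C> {
  readonly decodeQueueSize: number;
  decode(chunk: C): void;
  addEventListener(type: "dequeue", listener: () => void): void;
}

// Submits chunks in order, never letting more than `maxQueued` wait in the decoder.
class DecodeFeeder<C> {
  private next = 0;

  constructor(
    private readonly decoder: QueueingDecoder<C>,
    private readonly chunks: readonly C[],
    private readonly maxQueued: number, // assumption: chosen by the application
  ) {
    // Every time the decode queue shrinks, try to submit more work.
    decoder.addEventListener("dequeue", () => this.pump());
  }

  // Submit as many pending chunks as the queue limit currently allows.
  pump(): void {
    while (
      this.next < this.chunks.length &&
      this.decoder.decodeQueueSize < this.maxQueued
    ) {
      const chunk = this.chunks[this.next] as C;
      this.next += 1;
      this.decoder.decode(chunk);
    }
  }

  // Number of chunks handed to the decoder so far.
  get submitted(): number {
    return this.next;
  }
}
```

With a real decoder this would be used roughly as `new DecodeFeeder(decoder, chunks, 2).pump()`, where `2` is an arbitrary limit rather than a recommended value.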

RobertVillalba commented 1 year ago

@dalecurtis thank you for the quick reply! When you say this is effectively random, do you mean from video to video (I can hold 8 frames for video_a and 9 for video_b), or do you mean even within the same video (I can hold 9 frames at the start of a video, but further into the video I can only hold 8)? If the latter is the case, is there at least some minimal number of frames that I am guaranteed, even if that number is small (hopefully > 1)?

(edit) After further testing I see that this number really does seem to be random, even within the same video. I looked into the ondequeue callback, but it does not seem to provide any insight that I don't already get from the output callback, and I do not see a correlation between decodeQueueSize and whether a prior VideoFrame needs to be closed in order to free up the decoder. Short of that, is there some minimal number of frames that I know I can hold without locking up the decoder?

I suppose the feature request that could come out of this is some way of detecting that the decoder is backed up and won't produce any more output until VideoFrames are closed.

dalecurtis commented 1 year ago

There's generally no information on that from the OS-level APIs. Even internally within Chrome, we either guess about this or just assume that we can only get one buffer the majority of the time: https://source.chromium.org/search?q=CanReadWithoutStalling%20-f:debug%20file:media%2Fgpu%20-file:.h$&ss=chromium

So even if we wanted to, it's not something that is exposable in the majority of cases. When we pre-buffer for the <video> tag we just assume we can only have one buffer (yet queue up, say, N decodes) prior to playback. The WebCodecs equivalent would be submitting that number of chunks ahead of time and triggering playback when you get the first frame.
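A minimal sketch of that pattern, assuming the application holds at most one decoded frame at a time: the output handler closes the previously held frame, keeps the new one, and signals playback start on the first frame. `SingleFramePresenter`, `ClosableFrame`, `draw`, and `onFirstFrame` are hypothetical names for illustration. The only WebCodecs behavior relied on is that a frame has a `close()` method. Frame pacing is deliberately left out.

```typescript
// Hypothetical structural type: anything with close(), such as a VideoFrame.
interface ClosableFrame {
  close(): void;
}

// Holds at most one decoded frame so the decoder's buffers are returned quickly.
class SingleFramePresenter<F extends ClosableFrame> {
  private current: F | null = null;
  private started = false;

  constructor(
    private readonly draw: (frame: F) => void, // assumption: app-provided renderer
    private readonly onFirstFrame: () => void, // assumption: starts playback
  ) {}

  // Call this from the decoder's output callback.
  handleOutput(frame: F): void {
    // Release the previous frame before holding on to the new one.
    if (this.current !== null) {
      this.current.close();
    }
    this.current = frame;
    this.draw(frame);
    if (!this.started) {
      this.started = true;
      this.onFirstFrame();
    }
  }

  // Release the last held frame, for example when seeking or tearing down.
  dispose(): void {
    if (this.current !== null) {
      this.current.close();
      this.current = null;
    }
  }
}
```

In a browser this could be wired up as `output: (f) => presenter.handleOutput(f)` in the decoder's init dictionary, after which the N chunks are submitted with `decode()`.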

Can you describe what you're trying to do?

RobertVillalba commented 1 year ago

I see. It's quite surprising that this info is not available from the OS-level API, but I don't know much about that area. Either way, thank you for the insight! I will likely end up with some hybrid of the approach you just described and limited use of createImageBitmap for a minor amount of caching, although I'm not sure what limitations createImageBitmap might have with regard to memory consumption, or whether we can even know that ahead of time.

My specific use case is that I'm trying to create a video player that provides frame by frame navigation as well as some other features such as reverse playback. I've placed the latter on ice due to the caching limitations, but doing frame by frame backwards still seemed like it would be feasible. I was hoping a small cache of frames would diminish the number of times I'd have to decode the same GOP, but not having a consistent number (> 1) complicates things a bit.
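To make the re-decoding cost concrete, here is a small sketch that computes which chunks must be decoded to display a given frame: everything from the nearest preceding keyframe up to the target. `chunksToDecodeFor` and `keyframeIndices` are hypothetical names, and the sketch assumes chunk index order matches presentation order (no frame reordering) and that every keyframe can be decoded independently.

```typescript
// Returns the inclusive range of chunk indices needed to display `target`.
// Assumption: `keyframeIndices` lists the chunk indices that are keyframes.
function chunksToDecodeFor(
  target: number,
  keyframeIndices: readonly number[],
): { start: number; end: number } {
  let start = -1;
  for (const k of keyframeIndices) {
    // Keep the latest keyframe that is not after the target.
    if (k <= target && k > start) {
      start = k;
    }
  }
  if (start < 0) {
    throw new RangeError(`no keyframe at or before chunk ${target}`);
  }
  return { start, end: target };
}
```

With keyframes at indices 0, 30, and 60, stepping back from frame 45 to frame 44 means decoding chunks 30 through 44 again unless frame 44 is already cached, which is why even a small cache helps.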

dalecurtis commented 1 year ago

createImageBitmap should be unlimited until you run out of memory, so if you want to avoid having to worry about cache sizes, that's the recommended path. Using createImageBitmap for past frames seems like a reasonable idea to me. It's the only way you'll get large backward frame step operations working properly.
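A minimal sketch of a cache for past frames along these lines, keyed by timestamp. The cache itself is generic over anything with `close()` (as `ImageBitmap` has), so the copy step stays in application code. `PastFrameCache`, `ClosableBitmap`, and `capacity` are hypothetical names, the capacity is an application-chosen memory budget rather than a platform limit, and oldest-first eviction is just one possible policy.

```typescript
// Hypothetical structural type: anything with close(), such as an ImageBitmap.
interface ClosableBitmap {
  close(): void;
}

// Keeps up to `capacity` copied frames, evicting the oldest inserted entry first.
class PastFrameCache<B extends ClosableBitmap> {
  private readonly entries = new Map<number, B>();

  constructor(private readonly capacity: number) {}

  // Store a copy for `timestamp`, releasing any entry it replaces or evicts.
  put(timestamp: number, bitmap: B): void {
    const old = this.entries.get(timestamp);
    if (old !== undefined && old !== bitmap) {
      old.close();
    }
    // Delete first so the re-inserted key counts as the newest entry.
    this.entries.delete(timestamp);
    this.entries.set(timestamp, bitmap);
    while (this.entries.size > this.capacity) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldestKey = this.entries.keys().next().value as number;
      const oldest = this.entries.get(oldestKey);
      if (oldest !== undefined) {
        oldest.close();
      }
      this.entries.delete(oldestKey);
    }
  }

  get(timestamp: number): B | undefined {
    return this.entries.get(timestamp);
  }

  get size(): number {
    return this.entries.size;
  }

  // Release everything, for example when loading a different video.
  clear(): void {
    this.entries.forEach((b) => b.close());
    this.entries.clear();
  }
}
```

In a browser, the output callback could do something like `createImageBitmap(frame).then((b) => { cache.put(frame.timestamp, b); frame.close(); })`, so the decoder's frame is released as soon as the copy exists.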

RobertVillalba commented 1 year ago

Thank you for all the info. I was originally trying to avoid the performance penalty of createImageBitmap, but it seems that for now there is no way around it. Thank you again for all of your help; it is much appreciated!