w3c / webcodecs

WebCodecs is a flexible web API for encoding and decoding audio and video.
https://w3c.github.io/webcodecs/
Other
975 stars 137 forks source link

Relationship between VideoFrameMetadata and VideoFrameCallbackMetadata #599

Open youennf opened 1 year ago

youennf commented 1 year ago

https://wicg.github.io/video-rvfc/#dictdef-videoframecallbackmetadata defines some metadata that might be useful to be in VideoFrameMetadata. It might also be useful to expose future VideoFrameMetadata fields in VideoFrameCallbackMetadata. The following could be done:

  1. VideoFrameCallbackMetadata derives from VideoFrameMetadata
  2. Move some metadata from VideoFrameCallbackMetadata to VideoFrameMetadata: captureTime, receiveTime and rtpTimestamp.

@tguilbert-google, @dalecurtis, thoughts?

tguilbert-google commented 1 year ago

What advantage does VideoFrameCallbackMetadata get from deriving from VideoFrameMetadata, versus having a VideoFrameMetadata member instead? Is this to avoid the duplication of fields if they are moved around?

I assume all fields in VideoFrameMetada will be optional. Otherwise, this puts an extra burden on video.requestVideoFrameCallback implementations.

dalecurtis commented 1 year ago

It's a good question. Thanks for asking @youennf.

I think they're a bit unwieldy for direct derivation in either direction. E.g., Presentation time metadata and the fact that VideoFrameCallbackMetadata has a mediaTime field, but with VideoFrame we expect that to be the timestamp field.

I prefer just cross-pollinating the fields we think are important between the two without worrying too much about perfect sync. I.e., I agree the ones you list in 2 seem reasonable to add to VideoFrameMetadata.

youennf commented 1 year ago

The main question is whether there will be any field in VideoFrameMetadata we would not want to expose in VideoFrameCallbackMetadata. AIUI, we will always want to expose VideoFrameMetadata fields in VideoFrameCallbackMetadata. It does not seem great to have to update VideoFrameCallbackMetadata spec and implementation every time we add a VideoFrameMetadata field in the registry.

Hence why deriving VideoFrameCallbackMetadata from VideoFrameMetadata is nice (one issue is name clashing though). Composition is good too, I guess we could add a VideoFrameMetadata other to VideoFrameCallbackMetadata for instance. But this would duplicate some existing fields we might want to upstream to VideoFrameMetadata, as @tguilbert-google pointed out.

I'll file an issue about adding captureTime, receiveTime and rtpTimestamp to VideoFrameMetadata.

dalecurtis commented 1 year ago

I think that's actually a bit neat in reverse. So I'm in favor of VideoFrameCallbackMetadata inheriting from VideoFrameMetadata assuming we can get all those fields added to the registry. Our requirements for registry entry are much stricter than what VideoFrameCallback defined though, so that may be a bit tough.

youennf commented 1 year ago

I'm in favor of VideoFrameCallbackMetadata inheriting from VideoFrameMetadata

I think that we are in agreement, I am pushing for something like dictionary VideoFrameCallbackMetadata : VideoFrameMetadata { ... }. In that world, VideoFrameCallbackMetadata would keep some fields like timestamp.