w3c / webcodecs

WebCodecs is a flexible web API for encoding and decoding audio and video.
https://w3c.github.io/webcodecs/

Expose in VideoFrameMetadata some fields from VideoFrameCallbackMetadata #601

Open youennf opened 1 year ago

youennf commented 1 year ago

It seems useful to expose in VideoFrameMetadata some fields from VideoFrameCallbackMetadata, such as captureTime and receiveTime.

These could be added as entries in the VideoFrameMetadata registry.
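A minimal sketch of how an application might consume such fields once registered. The field names `captureTime` and `receiveTime` are assumptions mirroring VideoFrameCallbackMetadata, not yet registry entries; the helper works on the plain object that `VideoFrame.metadata()` returns:

```javascript
// Sketch: read hypothetical timing fields (names assumed, not yet in the
// VideoFrameMetadata registry) from the plain object returned by
// VideoFrame.metadata(). Fields may be absent, so handle undefined.
function extractCaptureTiming(metadata) {
  const { captureTime, receiveTime } = metadata;
  return {
    captureTime: captureTime ?? null,
    receiveTime: receiveTime ?? null,
    // One-way delay estimate, only meaningful when both fields are present.
    transportDelay:
      captureTime !== undefined && receiveTime !== undefined
        ? receiveTime - captureTime
        : null,
  };
}

// In a browser this would be called as extractCaptureTiming(frame.metadata()).
```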

dalecurtis commented 1 year ago

These sound fine to me, but @aboba points out that https://wicg.github.io/video-rvfc/ may not meet the registry requirements yet.

I know at TPAC there was some talk of moving rVFC into the HTML spec. Do we still want to do that?

@chrisn @tguilbert-google

youennf commented 1 year ago

I think the idea would be to write a registry entry that would define these values without referencing rvfc. Instead, it would reference webrtc-pc and/or media capture-main.

tguilbert-google commented 1 year ago

Yes, I was still planning on moving rVFC to the HTML spec. I haven't found the time to do so yet. Embedding the spec into the HTML spec is more work than just referencing it.

Previous (closed) PR tracking this work: https://github.com/whatwg/html/pull/5332#issuecomment-1251535284

chrisn commented 1 year ago

I agree, adding this into HTML makes sense.

aboba commented 1 year ago

The rVFC metadata items seem quite useful to me, but they also raise some questions about the behavior of other APIs:

  1. In the VideoFrame(s) produced by MediaStreamTrackProcessor, should we expect captureTime metadata to be present?
  2. Is there an expectation that the WebCodecs encoder will carry VideoFrame metadata items through to EncodedVideoChunk metadata (e.g. captureTime)?
  3. Is there an expectation that the WebCodecs decoder will carry EncodedVideoChunk metadata items through to VideoFrame metadata (e.g. receiveTime)?

Without an explicit change to the WebCodecs API, I'm assuming that the answer to questions 2 and 3 is "no". But what I'd really like to be able to do is to trace performance through each stage of the receive and send pipeline.
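Absent chunk-level metadata, the end-to-end tracing asked for above can be done by the application itself. A hedged sketch: since WebCodecs preserves the frame `timestamp` through encode and decode, a side table keyed on it can carry per-stage timings (the class and stage names are illustrative, not part of any API):

```javascript
// Sketch: EncodedVideoChunk carries no metadata today, so an application can
// trace per-frame timing through the pipeline with a side table keyed on the
// frame timestamp, which WebCodecs does preserve end to end.
class FrameTracer {
  constructor() {
    this.stages = new Map(); // timestamp -> { stageName: timeMs, ... }
  }
  // Record when a frame with this timestamp passed a named stage.
  mark(timestamp, stage, timeMs) {
    if (!this.stages.has(timestamp)) this.stages.set(timestamp, {});
    this.stages.get(timestamp)[stage] = timeMs;
  }
  // Elapsed ms between two recorded stages, or null if either is missing.
  report(timestamp, from, to) {
    const s = this.stages.get(timestamp);
    if (!s || s[from] === undefined || s[to] === undefined) return null;
    return s[to] - s[from];
  }
}
```

In a browser, `mark()` would be called from the capture loop, the encoder output callback, and the decoder output callback, each with `performance.now()`.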

youennf commented 1 year ago
  1. In the VideoFrame(s) produced by MediaStreamTrackProcessor, should we expect captureTime metadata to be present?

It depends on the MediaStreamTrack's source. If the source is getUserMedia, yes.

2. Is there an expectation that the WebCodecs encoder will carry VideoFrame metadata items through to EncodedVideoChunk metadata (e.g. captureTime)?

EncodedVideoChunk has no metadata, so I would think it would not be carried automatically. WebRTC encoded transform could decide to expose it in RTCEncodedVideoFrameMetadata, but this is out of scope for WebCodecs.

3. Is there an expectation that the WebCodecs decoder will carry EncodedVideoChunk metadata items through to VideoFrame metadata (e.g. receiveTime)?

My understanding is that the decoder will not have that data, so it will not set it. The application wrapping the decoder (say, the WebRTC pipeline) will have this data and can cheaply attach it to the video frame.
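The "application attaches it" pattern can be sketched as follows. The WebCodecs VideoFrame constructor accepts an init dictionary with a `metadata` member, so a wrapper around the decoder output can re-wrap each frame; `buildInit` is a hypothetical helper kept pure so its output shape is easy to test, and `receiveTime` is an assumed registry entry name:

```javascript
// Sketch: build the VideoFrameInit the wrapping application would pass to
// new VideoFrame(frame, init) to attach a receive time it tracked itself.
// "receiveTime" is an assumed metadata entry name, not a registered one.
function buildInit(timestamp, receiveTime) {
  return { timestamp, metadata: { receiveTime } };
}

// Browser-only usage (not runnable here), inside the VideoDecoder output
// callback, where rxTime was recorded when the chunk arrived:
//   const wrapped = new VideoFrame(frame, buildInit(frame.timestamp, rxTime));
//   frame.close();
```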

aboba commented 1 year ago

We now have a proposal to add some timing information to the RTCEncodedVideoFrameMetadata: https://github.com/w3c/webrtc-encoded-transform/pull/173

Would it make sense to define a metadata registry for EncodedVideoChunk and add similar info there?

guidou commented 3 months ago

Is it possible to make progress with this? Recently, we have received requests from developers asking for this, in particular capture time, to be exposed via MediaStreamTrackProcessor.

Djuffin commented 3 months ago

Why can't we use the regular VideoFrame.timestamp for capture time when frames are coming from a capture device?

tguilbert-google commented 3 months ago

@guidou reached out to me last Friday about surfacing capture timestamps, proposing VideoFrame.timestamp. My apologies, I didn't follow up after discussing with @Djuffin whether VideoFrame.timestamp was a valid path forward.

The MediaStreamTrackProcessor spec could define how the timestamp is populated, and might not need a WebCodecs spec change?
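If MediaStreamTrackProcessor defined the timestamp as the capture time, a consumer could read it directly from each frame. A hedged sketch of that consumer side; per the WebCodecs spec, `VideoFrame.timestamp` is in microseconds, and the helper converts to milliseconds for comparison with `performance.now()`-style clocks:

```javascript
// Sketch, assuming MediaStreamTrackProcessor populates VideoFrame.timestamp
// with the capture time. timestamp is in microseconds per the WebCodecs spec.
function timestampToMs(timestampUs) {
  return timestampUs / 1000;
}

// Browser-only usage (not runnable here):
//   const processor = new MediaStreamTrackProcessor({ track });
//   const reader = processor.readable.getReader();
//   for (;;) {
//     const { value: frame, done } = await reader.read();
//     if (done) break;
//     const captureMs = timestampToMs(frame.timestamp);
//     frame.close();
//   }
```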

Djuffin commented 3 months ago

We should probably discuss it in the next WG meeting. Extra VideoFrame metadata isn't very useful unless we carry it with video chunks and handle it in encoders and decoders.

guidou commented 1 month ago

Why can't we use regular VideoFrame.timestamp for capture time when frames are coming from a capture device?

We can do that. However, when writing a VideoFrame to a VideoTrackGenerator, it is useful to know that the timestamp is a capture timestamp (or, alternatively, that the VideoFrame should be treated as a frame coming from a capturer), since downstream sinks (e.g., a peer connection) may treat the frame differently from non-capturer frames. Currently it is not possible to make this distinction, even when the timestamp of a captured VideoFrame is the capture timestamp.

guidou commented 1 month ago

Extra VideoFrame metadata isn't very useful unless we carry it with video chunks and handle it in encoders and decoders.

We have use cases for VideoFrame that don't involve encoders or decoders (at least WebCodecs ones) that would benefit from this metadata. These use cases rely on mediacapture-transform.

alvestrand commented 1 month ago

It seems that when we have metadata on VideoFrame and metadata on RTCEncodedVideoFrame, and we're not looking at an option to make WebCodecs produce/consume RTCEncodedVideoFrame, there's an argument that we should add metadata to EncodedVideoChunk too, so that it can be carried throughout the ecosystem.
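Until such chunk metadata exists, the carriage can be approximated at the application layer. A purely hypothetical sketch (the wrapper class and field names are illustrative, not part of WebCodecs or webrtc-encoded-transform) pairing a chunk with timing fields like those proposed for RTCEncodedVideoFrameMetadata:

```javascript
// Sketch: EncodedVideoChunk has no metadata today, so a hypothetical
// application-level wrapper keeps timing fields alongside the chunk as it
// moves through the pipeline.
class ChunkWithMetadata {
  constructor(chunk, metadata = {}) {
    this.chunk = chunk; // the real EncodedVideoChunk in a browser
    this.metadata = { ...metadata };
  }
  // Merge in fields learned at a later stage (e.g. receiveTime on arrival).
  amend(fields) {
    Object.assign(this.metadata, fields);
    return this;
  }
}
```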

padenot commented 1 month ago

Yes, this was requested in the past, but I don't remember by whom or for what. It would be useful for sure. I think it was someone working in the cinema industry, but I could be completely misremembering.

aboba commented 1 month ago

@alvestrand @padenot Here are the Issues relating to Encoded*Chunk metadata: https://github.com/w3c/webcodecs/issues/245 https://github.com/w3c/webcodecs/issues/189