chcunningham opened 2 years ago
The mediacapture-transform specification does not currently describe how `timestamp` is processed.
Related: #96
Potential future issue when spatial scalability is supported:

With spatial scalability, you can have multiple `encodedChunk`s with the same timestamp (e.g. the base layer as well as spatial enhancement layers). Does this result in the decoder producing multiple `VideoFrame`s with the same timestamp? Or does the decoder wait until `encodedChunk.timestamp` advances before providing a single `VideoFrame` combining all the layers provided?
Currently, we do not configure the operating point in the WebCodecs decoder, so the decoder doesn't know the desired operating point or the layers that operating point depends on. At any given timestamp, the decoder could be provided with just a base-layer `encodedChunk`, or with the base layer plus some spatial enhancement layer frames. It can only know what it has to work with once the timestamp of the `encodedChunk`s advances (which adds delay), or if it is configured with the operating point (in which case it can start decoding once it has been provided with all the layers that the operating point depends on).
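The "wait until the timestamp advances" strategy described above can be sketched as a small buffering helper. This is illustrative only (the class name and callback shape are not from any spec): chunks that share a timestamp are held back, and the whole group is released once a chunk with a later timestamp arrives, which is where the extra delay comes from.

```javascript
// Buffer encoded chunks that share a timestamp; release the group only once
// the timestamp advances. Hypothetical helper, not part of WebCodecs.
class TimestampGrouper {
  constructor(onGroup) {
    this.onGroup = onGroup;      // called with the array of chunks sharing one timestamp
    this.pending = [];
    this.currentTimestamp = null;
  }
  push(chunk) {
    if (this.currentTimestamp !== null && chunk.timestamp !== this.currentTimestamp) {
      // Timestamp advanced: all layers for the previous timestamp are in hand.
      this.onGroup(this.pending);
      this.pending = [];
    }
    this.currentTimestamp = chunk.timestamp;
    this.pending.push(chunk);
  }
  flush() {
    // Emit any trailing group (e.g. at end of stream).
    if (this.pending.length) this.onGroup(this.pending);
    this.pending = [];
    this.currentTimestamp = null;
  }
}
```

Note the inherent one-group delay: the layers for timestamp T are only released when a chunk for T+1 shows up, which is exactly the latency cost the comment above describes.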
> With spatial scalability, you can have multiple `encodedChunk`s with the same timestamp (e.g. base layer as well as spatial enhancement layers). Does this result in the decoder producing multiple `VideoFrame`s with the same timestamp? Or does the decoder wait until `encodedChunk.timestamp` advances before providing a single `VideoFrame` combining all the layers provided?
In this case the decoder would produce multiple `VideoFrame`s with the same timestamp, but authors would be expected to discard many of these, passing only their desired resolution to the MSTG.
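A minimal sketch of that discard step, assuming frames from a WebCodecs `VideoDecoder` are filtered before being written to a `MediaStreamTrackGenerator` (the selection-by-resolution criterion and target dimensions are illustrative, not spec-mandated):

```javascript
// Keep only frames at the desired resolution; close() the rest so their
// backing memory is released. Frames for different layers share a timestamp.
function selectDesiredLayer(frames, targetWidth, targetHeight) {
  const kept = [];
  for (const frame of frames) {
    if (frame.codedWidth === targetWidth && frame.codedHeight === targetHeight) {
      kept.push(frame);
    } else {
      frame.close(); // discard layers at other resolutions
    }
  }
  return kept;
}

// In a page, the kept frames would then be written to the MSTG, e.g.:
//   const generator = new MediaStreamTrackGenerator({ kind: 'video' });
//   const writer = generator.writable.getWriter();
//   for (const f of selectDesiredLayer(batch, 1280, 720)) writer.write(f);
```

Calling `close()` on the discarded frames matters: `VideoFrame`s hold decoder resources, and leaving same-timestamp duplicates open can stall the decoder.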
This issue had an associated resolution in the WebRTC meeting – 19 November 2024 (Issue #80: Expectations/Requirements for VideoFrame and AudioData timestamps):
RESOLUTION: Add to mediacapture-main extensibility considerations to make sure sinks define their behavior on frame timestamps, and file issues on sink specs accordingly
Is it valid to append multiple `VideoFrame` or `AudioData` objects with the same timestamp (e.g. timestamp = 0) to a MediaStreamTrack? If so, what is the behavior? Does the spec describe this?