w3c / media-source

Media Source Extensions
https://w3c.github.io/media-source/

Consider defining clearly the PTS and DTS sourcing in each bytestream spec #292

Open wolenetz opened 2 years ago

wolenetz commented 2 years ago

To improve readability, especially since other specifications such as WebCodecs, WebRTC, HTML.rVFC, etc. may attach different semantics to "timestamp" or "presentation timestamp", it would be good to clarify, in each MSE bytestream format, where the PTS and DTS of an MSE coded frame originate in the underlying format.

See also https://github.com/w3c/webcodecs/issues/107#issuecomment-898311541

cconcolato commented 2 years ago

For ISOBMFF, here is what I would propose:

But reading the MSE spec, I see the following definition:

The decode timestamp indicates the latest time at which the frame needs to be decoded assuming instantaneous decoding and rendering of this and any dependant frames (this is equal to the presentation timestamp of the earliest frame, in presentation order, that is dependant on this frame)

The part in parentheses is not always true for ISOBMFF. Consider an edit list that shifts the presentation forward by 10s (media rate 1, media time -1, edit_duration 10s): the first frame will be presented at 10s while its decode time is 0. IIUC, this is neither explicitly permitted nor excluded by the ISOBMFF Byte Stream Format spec.
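
To make the edit list case concrete, here is a rough sketch of the arithmetic. It is a simplified model and not the MSE or ISOBMFF algorithm: all times are in whole seconds, `Sample`, `Edit` and `timestamps` are hypothetical names, and the second edit (media time 0) is an assumption added so that the media actually plays after the empty edit.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    duration: int            # decode duration in seconds (stts-like)
    composition_offset: int  # composition offset in seconds (ctts-like)


@dataclass
class Edit:
    edit_duration: int  # length of the edit on the presentation timeline, in seconds
    media_time: int     # start time in the media, or -1 for an empty edit


def timestamps(samples: List[Sample], edits: List[Edit]) -> List[Tuple[int, int]]:
    """Return (dts, pts) per sample under a simplified edit list model."""
    shift = 0        # total duration of leading empty edits
    media_start = 0  # media time at which the first non-empty edit starts
    for edit in edits:
        if edit.media_time == -1:
            shift += edit.edit_duration  # empty edit delays the presentation
        else:
            media_start = edit.media_time
            break  # only the first non-empty edit is modelled here

    result = []
    dts = 0
    for sample in samples:
        cts = dts + sample.composition_offset  # composition time before edits
        pts = shift + (cts - media_start)      # presentation time after edits
        result.append((dts, pts))
        dts += sample.duration                 # decode time ignores the edit list
    return result


# Three 1s frames with no reordering, preceded by a 10s empty edit.
frames = [Sample(1, 0), Sample(1, 0), Sample(1, 0)]
edit_list = [Edit(edit_duration=10, media_time=-1), Edit(edit_duration=3, media_time=0)]
print(timestamps(frames, edit_list))  # [(0, 10), (1, 11), (2, 12)]
```

In this model the first frame has a decode time of 0 but a presentation time of 10s, so its decode timestamp does not equal the presentation timestamp of the earliest frame that depends on it.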