w3c / webcodecs

WebCodecs is a flexible web API for encoding and decoding audio and video.
https://w3c.github.io/webcodecs/

Synchronised timestamps for AudioPackets and VideoFrames #82

Closed notorca closed 3 years ago

notorca commented 3 years ago

One of the critical features of the media recording process is track synchronization. VideoTrackReader and AudioTrackReader are friendly APIs for converting a MediaStreamTrack into AudioPackets/VideoFrames. Still, timestamps in those packets/frames may have different time bases, which makes it impossible to produce precise presentation timestamps in the resulting media stream. Web APIs already have MediaStream as the representation of a set of synchronized MediaStreamTracks. MediaStream could be used as an optional parameter for the VideoTrackReader and AudioTrackReader constructors to indicate that AudioPackets/VideoFrames from those readers should share the same timestamp base.

Example:

```javascript
const stream = await navigator.mediaDevices.getUserMedia(options);
const videoTrackReader = new VideoTrackReader(stream.getVideoTracks()[0], stream);
const audioTrackReader = new AudioTrackReader(stream.getAudioTracks()[0], stream);
// Now timestamps in audio packets and video frames from these readers
// would share the same time base.
```

That could be a bit tricky for streams created with tracks from different sources, like WebAudio and WebRTC, gUM, and canvas, but a best-effort stream presentation timestamp should work well.
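To make the problem concrete, here is a hypothetical helper (names are my own, not part of any API) that measures the apparent offset between the latest audio and video timestamps, assuming both are microseconds on the same time base, as WebCodecs timestamps are:

```javascript
// Hypothetical helper: the offset between an audio and a video timestamp,
// both in microseconds. The result is only meaningful if both timestamps
// share a time base -- which is exactly what this issue asks to guarantee.
function avOffsetUs(audioTimestampUs, videoTimestampUs) {
  return audioTimestampUs - videoTimestampUs;
}

// Audio at 1,000,000 us, video at 998,000 us: audio leads by 2 ms.
const offsetUs = avOffsetUs(1_000_000, 998_000);
```

If the two readers report timestamps in unrelated clock domains, this difference is noise rather than a real A/V offset.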

tguilbert-google commented 3 years ago

Hello!

FYI, VideoTrackReader/AudioTrackReader are being replaced by MediaStreamTrackProcessor (see #131), and the writer side is implemented by MediaStreamTrackGenerator. You can try these out on the latest Canary already (simple demo here).
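For reference, a minimal sketch of reading frames with the replacement API (browser-only, so the function is defined but not invoked here; it requires a live MediaStreamTrack):

```javascript
// Sketch: pull VideoFrames from a MediaStreamTrack using the newer
// MediaStreamTrackProcessor API (which replaces VideoTrackReader).
// Browser-only; not executed in this snippet.
async function logFrameTimestamps(track) {
  const processor = new MediaStreamTrackProcessor({ track });
  const reader = processor.readable.getReader();
  for (;;) {
    const { value: frame, done } = await reader.read();
    if (done) break;
    // frame.timestamp is in microseconds; close frames promptly
    // to release the underlying media resources.
    console.log("timestamp (us):", frame.timestamp);
    frame.close();
  }
}
```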

I've opened crbug.com/1180325 to double-check this within the Chromium implementation. I think this would already have worked for ATR/VTR, and even more so for MSTProcessor/MSTGenerator, which are designed to pipe tracks through transform streams in live situations.

So, two MediaStreamTracks coming from the same MediaStream should already emit AudioFrames/VideoFrames in the same timebase, but please open a bug on crbug.com if you run into issues.

chcunningham commented 3 years ago

FYI, see the latest updates on https://bugs.chromium.org/p/chromium/issues/detail?id=1180325 (it's not as cut-and-dried as we thought).

chcunningham commented 3 years ago

With VideoTrackReader now deprecated in favor of MediaStreamTrackProcessor, I've opened a new issue in their repository to track better documentation of sync contracts (or lack thereof): https://github.com/w3c/mediacapture-transform/issues/35

chcunningham commented 3 years ago

Closing as I don't think WebCodecs specifically has any action item. Please reopen if needed.

padenot commented 3 years ago

I agree. The WebCodecs AudioData and VideoFrame interfaces clearly state the timestamp, but it's the responsibility of authors (e.g. those implementing a demuxer that gets audio and video packets from the same file), or of APIs generating VideoFrames, to know what the clock domain is.
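As an illustration of that responsibility, an author who knows two sources share a clock domain could rebase each track's timestamps onto a shared zero point before muxing. This is a hypothetical sketch (class and method names are my own), and it assumes monotonic microsecond timestamps per source:

```javascript
// Hypothetical sketch: rebase a track's timestamps so its first packet
// lands at t=0. Only valid when the author knows the relationship between
// the sources' clock domains; rebasing each track independently discards
// any real inter-track offset, so it is a best-effort scheme.
class TimestampRebaser {
  constructor() {
    this.originUs = null; // first timestamp seen, in microseconds
  }
  rebase(timestampUs) {
    if (this.originUs === null) this.originUs = timestampUs;
    return timestampUs - this.originUs;
  }
}

const audioRebaser = new TimestampRebaser();
const videoRebaser = new TimestampRebaser();
// The first packet in each domain becomes t=0 for that track.
```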