microsoftgraph / microsoft-graph-comms-samples

Microsoft Graph Communications Samples

PolicyRecordingBot and FFMPEG: How do you recover video from received frames and timestamps? #731

Open kokosda opened 5 months ago

kokosda commented 5 months ago

PolicyRecordingBot receives H264 frames (Buffer.Data) along with related metadata such as Timestamp. This data can be read from the VideoMediaReceivedEventArgs object. As I understand it, the frames are inter-frame compressed. I also noticed that the resolution of the frames can change. Timestamps look like 39264692280552704. Passed to new DateTime(39264692280552704), that value lands in year 125, so I need to add 1899 years to get a real date. I merged all the H264 frames into one file, input.h264, and saved all the timestamps in another file, metadata.json. In metadata.json, each object describes a single frame from input.h264.
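The "add 1899 years" observation is what you would see if the timestamp were a count of 100 ns ticks starting at 1900-01-01 rather than at year 1. That epoch is an assumption inferred from the numbers in this post, not something the SDK documentation is quoted as confirming here. A minimal Python sketch of the conversion under that assumption (`ASSUMED_EPOCH` and `ticks_to_datetime` are illustrative names):

```python
from datetime import datetime, timedelta

# Assumption: timestamps are 100 ns ticks counted from 1900-01-01.
# This matches the "year 125 + 1899 years" observation, but is not confirmed.
ASSUMED_EPOCH = datetime(1900, 1, 1)


def ticks_to_datetime(ticks: int) -> datetime:
    """Convert 100 ns ticks to a datetime relative to the assumed epoch."""
    # 10 ticks of 100 ns make one microsecond; sub-microsecond precision is dropped.
    return ASSUMED_EPOCH + timedelta(microseconds=ticks // 10)


if __name__ == "__main__":
    print(ticks_to_datetime(39264692280552704).isoformat())
```

For rebuilding the video, only the differences between timestamps matter, so the choice of epoch affects the wall-clock date but not the frame timing.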

My question: how can I recover the source video from the frames and timestamps that I received from Teams, in particular using FFMPEG?

InDieTasten commented 5 months ago

The video streams received via VideoMediaReceived events are h264 NAL units. You are correct that these are inter-frame compressed. You can containerize the h264 stream into an mp4 file for playback. Note, though, that the stream can have a variable framerate and changing dimensions. You can reconstruct the frame timings from the timestamps, as you suggested, but as far as I know there is no direct way to feed these timings into ffmpeg during containerization. There are other tools that support specifying an exact frame time for each NAL unit and can properly "duplicate" frames in VFR streams to make them constant framerate, e.g. 30fps.
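As a rough illustration of "reconstructing the frame timings", the sketch below turns raw timestamps into per-frame offsets and durations in milliseconds. It assumes the timestamps are 100 ns ticks, that they are listed in the same order as the frames in the stream, and that they never decrease. The function names and the 30 fps fallback for the last frame are hypothetical choices, not part of any SDK or tool.

```python
from typing import List


def frame_offsets_ms(timestamps: List[int]) -> List[float]:
    """Offsets of each frame from the first one, in milliseconds.

    Assumes each timestamp is a count of 100 ns ticks (10,000 ticks per ms).
    """
    if not timestamps:
        return []
    first = timestamps[0]
    return [(t - first) / 10_000 for t in timestamps]


def frame_durations_ms(offsets: List[float],
                       last_duration_ms: float = 1000 / 30) -> List[float]:
    """Display duration of each frame: the gap to the next frame.

    The final frame has no successor, so it gets a fallback duration
    (hypothetical default: one frame at 30 fps).
    """
    durations = [b - a for a, b in zip(offsets, offsets[1:])]
    if offsets:
        durations.append(last_duration_ms)
    return durations


if __name__ == "__main__":
    ticks = [1_000_000, 1_330_000, 2_000_000]  # made-up sample values
    offsets = frame_offsets_ms(ticks)
    print(offsets)
    print(frame_durations_ms(offsets))
```

A list of offsets like this is the kind of input that per-frame timing files for muxing tools are built from; the exact file format depends on the tool chosen.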

If you'd like to know more, feel free to contact me directly. This is less a graph comms issue than an ffmpeg and media processing one, and it doesn't really relate to the samples in a significant way.