Vanilagy / mp4-muxer

MP4 multiplexer in pure TypeScript with support for WebCodecs API, video & audio.
https://vanilagy.github.io/mp4-muxer/demo
MIT License

Audio/Video gradually de-synchronize in long videos (30+ minutes) #71

Open Dan-Homerick-on24 opened 2 weeks ago

Dan-Homerick-on24 commented 2 weeks ago

I've actually found a fix for this one already. As samples are added to the MP4, each one is given an integer duration in ticks (timebase units). To get an integer, mp4-muxer rounds, which introduces a small error per sample. As the file grows, those errors accumulate in the total duration. It would be nice if the positive and negative errors averaged out, but they don't; instead the total takes a random walk, like repeated coin flips.

The audio and video each accumulate their own rounding-induced duration errors, and they drift apart over time.
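To make the described failure mode concrete, here is a minimal sketch of what independent per-sample rounding would do. It is not mp4-muxer's code (the maintainer questions later in this thread whether the library behaves this way), and `naiveDriftTicks`, the 1000 Hz timescale, and the frame durations are all illustrative assumptions. With constant frame durations the rounding error has the same sign every time, so it grows linearly; with varying durations it would wander instead.

```typescript
// Hypothetical helper, not part of mp4-muxer: rounds every sample duration on
// its own and reports how far the summed integer durations end up from the
// exact total, in timescale units (negative means the track came out short).
function naiveDriftTicks(durationSec: number, timescale: number, count: number): number {
  const perSampleTicks = Math.round(durationSec * timescale); // rounding error is simply discarded
  const exactTotalTicks = durationSec * timescale * count;
  return perSampleTicks * count - exactTotalTicks;
}

// Assumed example inputs: about 30 minutes of 1024-sample audio frames at
// 44.1 kHz and of 30 fps video, both on an assumed timescale of 1000.
const audioDriftTicks = naiveDriftTicks(1024 / 44100, 1000, 77519);
const videoDriftTicks = naiveDriftTicks(1 / 30, 1000, 54000);

// Each track drifts by its own amount, so the two end up out of step.
const avOffsetTicks = audioDriftTicks - videoDriftTicks;
```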

The fix is to round to an integer, but keep track of the fractional remainder. By carrying the remainder forward to the next duration, the errors don't accumulate, and the audio and video stay in sync.
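A minimal sketch of the remainder-carry idea follows. It is not necessarily how the linked branch implements it; `durationsToTicks` and its parameters are hypothetical names chosen for illustration.

```typescript
// Hypothetical helper, not part of mp4-muxer: converts sample durations in
// seconds to integer tick durations while carrying the rounding remainder
// into the next sample.
function durationsToTicks(durationsSec: number[], timescale: number): number[] {
  const ticks: number[] = [];
  let carry = 0; // fractional ticks left over from the samples so far
  for (const durationSec of durationsSec) {
    const exact = durationSec * timescale + carry;
    const rounded = Math.round(exact);
    carry = exact - rounded; // stays within half a tick in magnitude
    ticks.push(rounded);
  }
  return ticks;
}
```

Because the remainder is folded into the next duration, the sum of the integer durations should stay within about half a tick of the exact total, however many samples are added.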

I have my implementation of the fix here. I've tested it and verified that it fixes the drift issue. It's a pretty simple tweak: https://github.com/Dan-Homerick-on24/mp4-muxer/tree/duration-fraction-carry

diff: https://github.com/Dan-Homerick-on24/mp4-muxer/commit/fc052b29d37a16ad9310c4a53f56a85866b7f30d

Cheers!

Vanilagy commented 2 weeks ago

Hi Dan, that's lovely, thanks for coding the fix! Fascinating how there are still little bugs hiding in the timing code.

Given that you already have this implemented and on GitHub, would you mind closing this issue and opening a pull request instead? I'll look into it later then.

Vanilagy commented 2 weeks ago

Okay so I just checked my original code again and I'm actually not sure how it is bugged. Maybe you can clarify it for me, because I can't see it.

This part:

let timescaleUnits = intoTimescale(sample.decodeTimestamp, track.timescale, false);
let delta = Math.round(timescaleUnits - track.lastTimescaleUnits);
track.lastTimescaleUnits += delta;
track.lastSample.timescaleUnitsToNextSample = delta;

I don't see how this would cause long-term drift. timescaleUnits (a float) is always exact because it is never rounded. I also keep track of where the MP4 file currently thinks we are in lastTimescaleUnits (an int). By subtracting lastTimescaleUnits from timescaleUnits, I compute how many timescale units I need to advance to "catch up" to the current sample. Of course, I then need to round that number, but I also add the rounded number to the accumulator (lastTimescaleUnits), so the rounding error from one sample is absorbed into the next delta. All of this ultimately means that the timestamps in the resulting MP4 file should never be more than half a timescale unit away from their original timestamps.

Could you elaborate on the flaw in this algorithm?
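The half-unit bound claimed above can be checked with a small self-contained sketch. It mirrors the quoted snippet, but `TrackState`, `nextDelta`, and `maxTimestampError` are hypothetical stand-ins, and the plain multiplication replaces `intoTimescale` on the assumption that the unrounded conversion is simply seconds times timescale.

```typescript
// Stand-in for the track state used in the quoted snippet; illustrative only.
interface TrackState {
  timescale: number;
  lastTimescaleUnits: number; // integer position the file has reached so far
}

// Same steps as the quoted snippet: take the exact position, round only the
// delta, and add the rounded delta to the integer accumulator.
function nextDelta(track: TrackState, timestampSec: number): number {
  const timescaleUnits = timestampSec * track.timescale; // assumed unrounded conversion
  const delta = Math.round(timescaleUnits - track.lastTimescaleUnits);
  track.lastTimescaleUnits += delta;
  return delta;
}

// Feeds evenly spaced timestamps through nextDelta and returns the largest
// distance between the exact position and the accumulated integer position.
function maxTimestampError(frameDurationSec: number, timescale: number, count: number): number {
  const track: TrackState = { timescale, lastTimescaleUnits: 0 };
  let worst = 0;
  for (let i = 1; i <= count; i++) {
    const timestampSec = i * frameDurationSec;
    nextDelta(track, timestampSec);
    const exact = timestampSec * timescale;
    worst = Math.max(worst, Math.abs(exact - track.lastTimescaleUnits));
  }
  return worst;
}
```

Since each delta is measured from the integer accumulator rather than from the previous exact timestamp, the error is re-measured on every sample instead of being summed, which is why it should not grow with file length.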

Dan-Homerick-on24 commented 1 week ago

I think you're right -- your current algorithm should work. It doesn't make sense to add a second approach with the same goal. It may take me a couple days to make time for it, but I'll try to get to the bottom of what's going on.

Vanilagy commented 1 week ago

Sure! Reach out to me once you know more.