androidx / media

Jetpack Media3 support libraries for media use cases, including ExoPlayer, an extensible media player for Android
https://developer.android.com/media/media3
Apache License 2.0

Audio sync issue with mismatched track lengths #921

Open jordond opened 11 months ago

jordond commented 11 months ago

Version

Media3 1.2.0

More version details

No response

Devices that reproduce the issue

All

Devices that do not reproduce the issue

No response

Reproducible in the demo app?

Not tested

Reproduction steps

I have a list of video files with mismatched track durations. For example, the video track is 3000ms but the audio track is 3030ms.

When I pass that list to Transformer, the audio eventually desyncs from the video. I have tried to fix this by using MediaExtractor to extract the durations of the audio and video tracks, taking the shorter of the two, and passing that into MediaItem.setClippingConfiguration().

This helps reduce the audio lag, but it still gets out of sync.
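The clipping attempt described above could look roughly like the following sketch. The helper names shortestTrackDurationMs and clippedToShortestTrack are hypothetical and not taken from the repro project. It assumes each track's MediaFormat reports KEY_DURATION in microseconds, and it rounds down to whole milliseconds.

```kotlin
import android.media.MediaExtractor
import android.media.MediaFormat
import androidx.media3.common.MediaItem

/**
 * Hypothetical helper: returns the shortest audio/video track duration in
 * milliseconds, or null if no audio or video track reports a duration.
 */
fun shortestTrackDurationMs(path: String): Long? {
    val extractor = MediaExtractor()
    try {
        extractor.setDataSource(path)
        val durationsUs = (0 until extractor.trackCount)
            .map { extractor.getTrackFormat(it) }
            .filter { format ->
                // Only consider audio and video tracks.
                val mime = format.getString(MediaFormat.KEY_MIME).orEmpty()
                mime.startsWith("audio/") || mime.startsWith("video/")
            }
            .filter { it.containsKey(MediaFormat.KEY_DURATION) }
            .map { it.getLong(MediaFormat.KEY_DURATION) } // microseconds
        // Integer division truncates any sub-millisecond remainder.
        return durationsUs.minOrNull()?.let { it / 1000 }
    } finally {
        extractor.release()
    }
}

/** Hypothetical helper: builds a MediaItem clipped to its shortest track. */
fun clippedToShortestTrack(path: String): MediaItem {
    val builder = MediaItem.Builder().setUri(path)
    shortestTrackDurationMs(path)?.let { endMs ->
        builder.setClippingConfiguration(
            MediaItem.ClippingConfiguration.Builder()
                .setEndPositionMs(endMs)
                .build()
        )
    }
    return builder.build()
}
```

If no duration is reported, the sketch leaves the item unclipped rather than guessing an end position.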

I will be emailing a zip file containing an Android application where you can reproduce this.

Once you run the app there are three different clipping options:

  1. None
  2. Shortest Track
    • This will do what I mentioned above and take the shortest of the two tracks
  3. Custom
    • This allows you to set a custom endPositionMs for the ClippingConfiguration

I added the third option because once each video is clipped to 1000ms, there is no sync issue whatsoever. So in my testing I tried lopping off more and more milliseconds to find the minimum amount I needed to trim to keep the audio in sync. But that ended up losing too many frames.

Expected result

The exported video file's audio and video should be synced.

Actual result

The audio slowly drifts out of sync, and by the end of the video the audio is drastically out of sync.

Media

I will be emailing a reproduction project, which contains all the assets and an application to test it.

Bug Report

Samrobbo commented 11 months ago

@droid-girl @tof-tof is this in either of your remits?

jordond commented 10 months ago

@Samrobbo

I have made a discovery that fixes the audio sync issue.

If you clip the videos to the shortest duration first, and then pass all those files to be concatenated, the audio is in sync.

This "fix" isn't ideal, as you now have to transform each video file into a temporary file and then pass those temporary files to Transformer to be concatenated.

But at least it gives you guys somewhere to look.

I have sent another email to android-media-github@google.com with an updated Repro project that includes the option to clip the videos separately first.

droid-girl commented 10 months ago

@jordond: just to clarify, you have two EditedMediaItemSequences, where the first sequence contains a list of video items (total duration 3000ms) and the second contains an audio track that you want to play along with the first sequence (total duration 3030ms)? If that is the case, when you define your audio sequence, do you set isLooping to true?

val backgroundAudioSequence = EditedMediaItemSequence(
   ImmutableList.of(backgroundAudio),
   /* isLooping= */ true)

jordond commented 10 months ago

@droid-girl Sorry no. This issue happens with a single EditedMediaItemSequence.

Each EditedMediaItem is a video that might have an audio track that is slightly longer or shorter than the video track.

ie: video track 1000ms, audio track 1003ms

When creating an EditedMediaItemSequence of all these videos, the audio slowly drifts out of sync with the video as the transformed output plays.

I originally attempted to fix this by using ClippingConfiguration to trim the video to the shortest track length. In the example above I would have a ClippingConfiguration with startMs = 0 and endMs = 1000.

This sort of worked, but in a long enough Sequence the audio would drift again.

I did find a workaround though:

  1. For each source video:
    • Create a clipping configuration to the shortest track ie 1000ms
    • Build an EditedMediaItemSequence with the single video file
    • Pass it to Transformer and save the file to the cacheDir
  2. Once all the videos have been pre-clipped, create an EditedMediaItemSequence that contains all of the pre-processed videos from the cache directory

This works for some reason, but it means the whole process takes longer and requires the intermediate video files.
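The two-pass workaround above can be sketched as follows. This is a minimal sketch, assuming the Media3 1.2.0 API from this report (the public EditedMediaItemSequence list constructor and Composition.Builder taking a list of sequences); later releases may prefer builders instead. The function names preClipItem and concatenation are hypothetical.

```kotlin
import androidx.media3.common.MediaItem
import androidx.media3.transformer.Composition
import androidx.media3.transformer.EditedMediaItem
import androidx.media3.transformer.EditedMediaItemSequence

/**
 * Pass 1 (hypothetical helper): describes a single source video clipped to
 * endPositionMs, to be exported on its own into the cache directory.
 */
fun preClipItem(sourceUri: String, endPositionMs: Long): EditedMediaItem {
    val clipped = MediaItem.Builder()
        .setUri(sourceUri)
        .setClippingConfiguration(
            MediaItem.ClippingConfiguration.Builder()
                .setEndPositionMs(endPositionMs)
                .build()
        )
        .build()
    return EditedMediaItem.Builder(clipped).build()
}

/**
 * Pass 2 (hypothetical helper): joins the already exported, pre-clipped
 * files into one sequence without any further clipping.
 */
fun concatenation(preClippedUris: List<String>): Composition {
    val items = preClippedUris.map { uri ->
        EditedMediaItem.Builder(MediaItem.fromUri(uri)).build()
    }
    return Composition.Builder(listOf(EditedMediaItemSequence(items))).build()
}
```

Each item from preClipItem would be passed to Transformer.start(editedMediaItem, outputPath), and the result of concatenation to Transformer.start(composition, outputPath), running one export at a time and waiting for Transformer.Listener.onCompleted before starting the next.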

I did send a repro project to the specified email that demonstrates this behaviour.

droid-girl commented 10 months ago

Thank you for reporting this and sending the repro project. We will take a look at the issue

tof-tof commented 10 months ago

Thanks for the info @jordond, I think I know why this happens, sharing technical details here so my team can have context.

We use ExoPlayer's ClippingConfiguration to clip the video, and depending on the requested clip times and the encoding of the buffers, we may not be able to clip the input MediaItem to the exact length (usually no more than a millisecond off, though). However, ExoPlayer reports the duration of the video as clipDuration = clipEndTime - clipStartTime, so when we join the videos in Transformer, the offsets from one MediaItem to the next are slightly off, and this error accumulates over multiple videos to create the problem described. If for multiple EditedMediaItems in a sequence:

This is going to be tricky to fix, given that we don't know the true accurate duration until we receive the last audio input buffer from ExoPlayer. We will probably have to drop extra audio buffers somewhere, probably in the assetLoader, so that the real audio duration is no more than the clip duration.
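The accumulation described above can be illustrated with a small, simplified model. It is not Transformer's actual code. It assumes each item's audio is appended back to back while the video is positioned using the reported clip duration, and it reuses the 1000ms/1003ms figures mentioned earlier in this thread. The names ItemDurations and audioLagAtItemStart are hypothetical.

```kotlin
/** Reported clip duration versus the audio actually produced, in ms. */
data class ItemDurations(val reportedClipMs: Long, val actualAudioMs: Long)

/**
 * For each item, returns how far (in ms) its audio start lags behind its
 * video start once all previous items have been joined.
 */
fun audioLagAtItemStart(items: List<ItemDurations>): List<Long> {
    var lagMs = 0L
    return items.map { item ->
        val lagAtStart = lagMs
        // Any audio beyond the reported clip duration pushes later audio back.
        lagMs += item.actualAudioMs - item.reportedClipMs
        lagAtStart
    }
}

fun main() {
    val items = List(10) { ItemDurations(reportedClipMs = 1000, actualAudioMs = 1003) }
    println(audioLagAtItemStart(items))
    // prints [0, 3, 6, 9, 12, 15, 18, 21, 24, 27]
}
```

Under this model a 3ms surplus per item is inaudible for one clip but grows linearly with the number of items in the sequence.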

Will talk with team and update on resolution soon.

jordond commented 10 months ago

* the real audio duration is greater than the clip duration, then we have a problem (the video will seem slightly ahead of the audio - @jordond can you confirm this is the case in the transformed video?)

Yes that's what I'm experiencing 👍

Will talk with team and update on resolution soon.

Sounds great! Thanks for looking into it.

jordond commented 8 months ago

Any updates on this issue?

tof-tof commented 8 months ago

We are looking into ways to solve this issue, and there are a lot of different proposals. We should have something soon.

tof-tof commented 7 months ago

Handing this over to @ychaparov, since we have decided on a way forward to fix the issue and he will be landing the change.

Yordan, please remember to put the link to this GitHub issue in the CL description so the bot can auto-update and link the corresponding GitHub commit.

ychaparov commented 6 months ago

Hi @jordond -- the recently landed fix, https://github.com/androidx/media/commit/b9ec24a2696b79d3b5f56d08dd9a90ae807d530b , might alleviate this problem for cases when the audio track is shorter than the video track.

Do you frequently encounter files where the video track is shorter than the audio track? What's their source?

jordond commented 6 months ago

Awesome! Will this be available in 1.4.0?

We are using FFmpeg to trim videos to a desired length (usually one second), and for whatever reason the track lengths sometimes don't match up. We're still looking into that issue, but we will still have users with legacy videos with mismatched tracks.

ychaparov commented 6 months ago

Yes, this should be available in the next release.

It will be a bigger change for us to support the case when the video track is shorter than the audio track. We will eventually fix it, but the timeline is unclear.