angelcam / rust-ac-ffmpeg

Simple and safe Rust interface for FFmpeg libraries.
MIT License

Blips in audio when PTS values are incorrect #32

Open dceddia opened 3 years ago

dceddia commented 3 years ago

I stumbled upon a file that played back with weird audio artifacts (popping), even though it plays fine in other players (QuickTime, VLC, and ffplay). I added some prints to resampler.c inside the compensation code in ffw_audio_resampler_push_frame, and also printed the before/after info about the AudioFrame in Rust. It comes out like this:

[ffw_audio_resampler_push_frame] frame is fine
Resampled audio from 1024 samples @ 48000Hz (pts "0/48000") to 925 samples at 44100Hz (pts "0/44100")
[ffw_audio_resampler_push_frame] frame is fine
Resampled audio from 1024 samples @ 48000Hz (pts "1024/48000") to 941 samples at 44100Hz (pts "925/44100")
[ffw_audio_resampler_push_frame] dropping samples, pts_delta is -8
Resampled audio from 1024 samples @ 48000Hz (pts "2040/48000") to 933 samples at 44100Hz (pts "1866/44100")
[ffw_audio_resampler_push_frame] dropping samples, pts_delta is -16
Resampled audio from 1024 samples @ 48000Hz (pts "3048/48000") to 927 samples at 44100Hz (pts "2799/44100")
[ffw_audio_resampler_push_frame] frame is fine
Resampled audio from 1024 samples @ 48000Hz (pts "4072/48000") to 941 samples at 44100Hz (pts "3726/44100")
[ffw_audio_resampler_push_frame] frame is fine
Resampled audio from 1024 samples @ 48000Hz (pts "5096/48000") to 941 samples at 44100Hz (pts "4667/44100")
[ffw_audio_resampler_push_frame] frame is fine
Resampled audio from 1024 samples @ 48000Hz (pts "6120/48000") to 941 samples at 44100Hz (pts "5608/44100")
[ffw_audio_resampler_push_frame] dropping samples, pts_delta is -16
Resampled audio from 1024 samples @ 48000Hz (pts "7128/48000") to 926 samples at 44100Hz (pts "6549/44100")
[ffw_audio_resampler_push_frame] injecting silence, pts_delta is 32
Resampled audio from 1024 samples @ 48000Hz (pts "8184/48000") to 971 samples at 44100Hz (pts "7475/44100")
[ffw_audio_resampler_push_frame] frame is fine
Resampled audio from 1024 samples @ 48000Hz (pts "9208/48000") to 940 samples at 44100Hz (pts "8446/44100")
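
The pts_delta values in the log can be reproduced by comparing each input PTS with the previous input PTS plus the previous frame's 1024 samples. The sketch below is only a model of that arithmetic, not the code from resampler.c. The `Action` enum and `classify` function are hypothetical names, and the assumption that the number of dropped or injected samples equals the delta is inferred from the output sample counts in the log.

```rust
/// What a hard-compensating resampler would do with one input frame.
/// Hypothetical type, used only for illustration.
#[derive(Debug, PartialEq)]
enum Action {
    Keep,
    DropSamples(i64),
    InjectSilence(i64),
}

/// Compare each frame's PTS with the PTS implied by the previous frame
/// (previous PTS + previous sample count), in input sample-rate units.
/// `frames` holds `(pts, nb_samples)` pairs.
fn classify(frames: &[(i64, i64)]) -> Vec<Action> {
    let mut expected: Option<i64> = None;
    let mut actions = Vec::with_capacity(frames.len());
    for &(pts, nb_samples) in frames {
        // The first frame has nothing to be compared against.
        let delta = expected.map_or(0, |e| pts - e);
        actions.push(match delta {
            0 => Action::Keep,
            d if d < 0 => Action::DropSamples(-d),
            d => Action::InjectSilence(d),
        });
        // Assumption: the expectation restarts from the reported PTS,
        // which is what the logged deltas suggest.
        expected = Some(pts + nb_samples);
    }
    actions
}

fn main() {
    // Input PTS values taken from the log above; every frame has 1024 samples.
    let pts = [0, 1024, 2040, 3048, 4072, 5096, 6120, 7128, 8184, 9208];
    let frames: Vec<(i64, i64)> = pts.iter().map(|&p| (p, 1024)).collect();
    for (frame, action) in frames.iter().zip(classify(&frames)) {
        println!("pts {:>5}: {:?}", frame.0, action);
    }
}
```

With the timestamps from the log, this model flags the same frames as the prints do: deltas of -8, -16, -16 and +32 on the third, fourth, eighth and ninth frames.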

I think part of the problem is that this video was probably encoded badly. Other players seem to handle it fine though, and I'd like mine to handle it fine too 😄 It seems reasonable to me to ignore the junk timestamps and go off the length of the samples themselves, so for now I've taken out the code that does the compensation, and it's playing fine with no blips.
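
As a rough illustration of "going off the length of the samples", timestamps can be derived from a running sample count instead of the reported PTS. This is a minimal sketch under that assumption. `SampleClock` is a hypothetical helper and is not part of ac-ffmpeg.

```rust
/// Hypothetical helper: assigns timestamps from the running sample count
/// and ignores whatever PTS the container reported.
struct SampleClock {
    next_pts: i64,
}

impl SampleClock {
    fn new(start_pts: i64) -> Self {
        Self { next_pts: start_pts }
    }

    /// Returns the PTS for a frame of `nb_samples` samples and advances
    /// the clock by that many samples.
    fn stamp(&mut self, nb_samples: i64) -> i64 {
        let pts = self.next_pts;
        self.next_pts += nb_samples;
        pts
    }
}

fn main() {
    let mut clock = SampleClock::new(0);
    // Reported PTS values from the first frames of the log (48000 Hz time base).
    for reported in [0i64, 1024, 2040, 3048] {
        let pts = clock.stamp(1024);
        println!("reported {reported:>5} -> rewritten {pts:>5}");
    }
}
```

The trade-off is that real gaps in the stream (lost packets, for example) are no longer visible in the timestamps, so audio can drift against video over time.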

I saw that ffmpeg has a swr_next_pts function that looks similar to what the code in ffw_audio_resampler_push_frame is doing, and has a few modes controlled by some variables on the context. It looks like it can effectively be "off", "soft compensation", or "hard compensation". ac-ffmpeg's implementation looks most similar to hard compensation. Maybe ac-ffmpeg could have a similar set of flags, or use swr_next_pts directly?
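
To make the "off / soft / hard" idea concrete, here is a sketch of what such a set of flags could look like on the Rust side. Everything in it is hypothetical: `Compensation`, `Correction` and `decide` do not exist in ac-ffmpeg, and the thresholds are only loosely modeled on my understanding of the libswresample options min_comp and min_hard_comp.

```rust
/// Hypothetical compensation policy; not part of the ac-ffmpeg API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Compensation {
    /// Trust the sample count and ignore timestamp drift.
    Off,
    /// Drift above `min` seconds is absorbed gradually; drift above
    /// `min_hard` seconds is corrected immediately.
    Soft { min: f64, min_hard: f64 },
    /// Any drift above `min` seconds is corrected immediately by
    /// dropping samples or inserting silence.
    Hard { min: f64 },
}

/// What the resampler would do for a given drift.
#[derive(Debug, PartialEq)]
enum Correction {
    None,
    Stretch,
    DropOrPad,
}

/// Pick a correction for a timestamp drift given in seconds.
fn decide(policy: Compensation, drift_secs: f64) -> Correction {
    let drift = drift_secs.abs();
    match policy {
        Compensation::Off => Correction::None,
        Compensation::Hard { min } if drift > min => Correction::DropOrPad,
        Compensation::Soft { min_hard, .. } if drift > min_hard => Correction::DropOrPad,
        Compensation::Soft { min, .. } if drift > min => Correction::Stretch,
        _ => Correction::None,
    }
}

fn main() {
    // A 32-sample discrepancy at 48000 Hz, like the largest one in the log.
    let drift = 32.0 / 48_000.0;
    let policies = [
        Compensation::Off,
        Compensation::Soft { min: 0.0001, min_hard: 0.1 },
        Compensation::Hard { min: 0.0 },
    ];
    for policy in policies {
        println!("{:?} -> {:?}", policy, decide(policy, drift));
    }
}
```

In this model the current behavior would correspond to `Hard { min: 0.0 }`, which could stay the default, while `Off` would match what I get after removing the compensation code.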

operutka commented 2 years ago

Sorry for the delay, I was out of office for a few weeks. You're right that the audio resampler is quite aggressive. It crops audio frames or adds silence whenever the timestamp sequence does not match the actual number of samples.

This behavior is quite crucial for us at Angelcam because we use it to correct crappy audio received from IP cameras, so it's important for us to keep it available. We haven't tested the swr_next_pts() function yet, but I'll definitely take a look at it. Additional options would be useful though.