gnif / LookingGlass

An extremely low latency KVMFR (KVM FrameRelay) implementation for guests with VGA PCI Passthrough.
GNU General Public License v2.0

[client] audio: adjust playback speed to match audio device clock #946

Closed spencercw closed 2 years ago

spencercw commented 2 years ago

This change is based on the techniques described in [1] and [2].

The input audio stream from Spice is not synchronised to the audio playback device. While the input and output may both nominally run at 48 kHz, their actual rates will differ from each other by a tiny fraction of a percent. Given enough time (typically on the order of a few hours), this will result in the ring buffer becoming completely full or completely empty. It will stay in this state permanently, periodically resulting in glitches as the buffer repeatedly underruns or overruns.
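As a back-of-envelope illustration of the time scale, the sketch below estimates how long a fixed amount of buffer headroom lasts under a constant rate mismatch. The 5 ppm mismatch and 2048 samples of headroom are illustrative assumptions, not measurements from this PR, and the function name is hypothetical.

```python
def seconds_until_buffer_limit(headroom_samples, nominal_rate_hz, mismatch_ppm):
    """Time until the ring buffer runs full or empty under a constant mismatch.

    headroom_samples: distance from the current fill level to the nearest
                      limit (assumed value, for illustration only).
    mismatch_ppm:     relative rate difference between source and device,
                      in parts per million (assumed value).
    """
    # Net samples gained or lost per second of playback.
    drift_samples_per_s = nominal_rate_hz * mismatch_ppm / 1e6
    return headroom_samples / drift_samples_per_s


if __name__ == "__main__":
    seconds = seconds_until_buffer_limit(2048, 48000, 5)
    print(f"{seconds / 3600:.2f} hours until the buffer hits a limit")
```

With these assumed numbers the drift is about a quarter of a sample per second, so the headroom lasts a little over two hours, which is consistent with the "few hours" figure above.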

To address this, adjust the speed of the received data to match the rate at which it is being consumed by the audio device. This will result in a slight pitch shift, but the changes should be small and smooth enough that this is unnoticeable to the user.

The process works roughly as follows:

  1. Every time audio data is received from Spice, or consumed by the audio device, sample the current time. These are fed into a pair of delay locked loops to produce smoothed approximations of the two clocks.
  2. Compute the difference between the two clocks and compare this against the target latency to produce an error value. This error value will be quite stable during normal operation, but can change quite rapidly due to external factors, particularly at the start of playback. To smooth out any sudden changes in playback speed, which would be noticeable to the user, this value is also filtered through another delay locked loop.
  3. Feed this error value into a PI controller to produce a ratio value. This is the target playback speed in order to bring the error value towards zero.
  4. Resample the input audio using the computed ratio to apply the speed change. The output of the resampler is what is ultimately inserted into the ring buffer for consumption by the audio device.
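For readers unfamiliar with the delay locked loop used in step 1, here is a minimal sketch of a second-order loop in the style of [2]. It is not the code in this PR: the class name, the 0.5 Hz bandwidth, the 1024-sample period, the exaggerated 47.9 kHz true rate, and the alternating 0.5 ms jitter are all assumptions chosen to make the effect visible.

```python
import math


class DelayLockedLoop:
    """Second-order loop that smooths jittery period timestamps."""

    def __init__(self, nominal_period_s, bandwidth_hz, first_timestamp_s):
        # Loop coefficients derived from the bandwidth (assumed form).
        omega = 2.0 * math.pi * bandwidth_hz * nominal_period_s
        self.b = math.sqrt(2.0) * omega
        self.c = omega * omega
        self.period = nominal_period_s           # filtered period estimate
        self.t0 = first_timestamp_s              # filtered start of current period
        self.t1 = first_timestamp_s + nominal_period_s  # predicted next start

    def update(self, timestamp_s):
        # Error between the raw timestamp and the loop's prediction.
        err = timestamp_s - self.t1
        self.t0 = self.t1
        self.t1 += self.b * err + self.period
        self.period += self.c * err
        return self.t0


def demo():
    nominal = 1024 / 48000.0       # what the device claims
    true_period = 1024 / 47900.0   # what it actually does (exaggerated)
    jitter = 0.0005                # deterministic +/- 0.5 ms scheduling jitter
    dll = DelayLockedLoop(nominal, 0.5, first_timestamp_s=jitter)
    for k in range(1, 2001):
        raw = k * true_period + (jitter if k % 2 == 0 else -jitter)
        dll.update(raw)
    return dll.period, true_period, nominal


if __name__ == "__main__":
    period, true_period, nominal = demo()
    print(f"nominal {nominal:.7f} s, true {true_period:.7f} s, DLL {period:.7f} s")
```

In this sketch the filtered period ends up much closer to the true period than the nominal value, even though each raw timestamp is off by half a millisecond. Running one such loop on the Spice side and one on the device side gives the two smoothed clocks that step 2 compares.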

Since this process targets a specific latency value, rather than simply trying to rate match the input and output, it also has the effect of 'correcting' latency issues. If a high latency application (such as a media player) is already running, the time between requesting the start of playback and the audio device actually starting to consume samples can be very high, easily in the hundreds of milliseconds. The changes here will automatically adjust the playback speed over the course of a few minutes to bring the latency back down to the target value.
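The latency-correcting behaviour can be illustrated with a toy simulation of the PI controller alone. This is a sketch under stated assumptions, not the implementation in this PR: it omits the delay locked loops and the resampler, uses the true buffer level as the error input, and the gains, the 30 ms target, the 100 ms initial excess, the 20 ppm mismatch, and the clamp range are all made-up values.

```python
def simulate(duration_s=600.0, dt=0.01,
             in_rate=48000.0 * (1 + 20e-6),  # source runs 20 ppm fast (assumed)
             out_rate=48000.0,               # device consumption rate
             target_s=0.030,                 # target latency (assumed)
             start_excess_s=0.100,           # extra latency at start (assumed)
             kp=0.2, ki=0.0025):             # PI gains (assumed)
    # Ring buffer fill level in samples.
    level = (target_s + start_excess_s) * out_rate
    integral = 0.0
    ratio = 1.0
    for _ in range(int(round(duration_s / dt))):
        # Positive error means more audio is buffered than the target.
        error_s = level / out_rate - target_s
        integral += error_s * dt
        # Too much buffered: emit fewer output samples per input sample,
        # which plays the audio slightly faster.
        ratio = 1.0 - (kp * error_s + ki * integral)
        # Guard against extreme ratios (assumed safety clamp).
        ratio = min(max(ratio, 0.9), 1.1)
        # Resampled input arrives, device consumes.
        level += in_rate * dt * ratio - out_rate * dt
    return level / out_rate - target_s, ratio


if __name__ == "__main__":
    error_s, ratio = simulate()
    print(f"final latency error {error_s * 1e3:.4f} ms, ratio {ratio:.7f}")
```

In this toy model the excess latency is worked off over a few minutes, and the ratio settles at the value that cancels the rate mismatch rather than at exactly 1. The integral term is what removes the residual error that a proportional-only controller would leave.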

This graph shows the behaviour at the start of playback when a media player is running in the background; notice how the number of buffered samples starts very high then gradually settles at a more reasonable level. It also shows the typical impulse response when the audio device changes the period size. At 400s, playback in the media player was paused, resulting in the period size changing from 2048 to 1024 samples. At 550s, playback was resumed, resulting in the period size reverting to 2048 samples.

[Graph: rate_change]

This graph shows that the number of samples in the ring buffer remains stable, even after being left running for several hours.

[Graph: overnight]

[1] https://kokkinizita.linuxaudio.org/papers/adapt-resamp.pdf
[2] https://kokkinizita.linuxaudio.org/papers/usingdll.pdf

gnif commented 2 years ago

Oh, also please update the documentation to include the libsamplerate dependency, as well as the GitHub workflow.

gnif commented 2 years ago

I am going to branch this, as I now also have a set of improvements/changes, and will address the requested changes there.

gnif commented 2 years ago

This PR has been merged

8BallBomBom commented 2 years ago

Started having issues after this PR. Either audio doesn't play at all, or it plays for a little while and then randomly goes silent or glitches. By glitches I mean distortion, or the playback speed going up or down, which is very noticeable. I can see this error in the logs on and off. All I'm doing is playing random YouTube videos, starting and stopping them.

[E]  54102065716             audio.c:566  | audio_playbackData             | Resampling failed: SRC ratio outside [1/256, 256] range.

Also, this is how I have the audio configured for the VM.

    <sound model="ich9">
      <codec type="micro"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x1b" function="0x0"/>
    </sound>
    <audio id="1" type="spice"/>

gnif commented 2 years ago

We are aware. Please don't flood us with reports on this feature yet; it's considered experimental.

8BallBomBom commented 2 years ago

No worries, wasn't sure if it was reported or not, my bad 👍🏻