Ralith / oddio

Lightweight game audio
Apache License 2.0

Consider pluggable interpolation #12

Open Ralith opened 3 years ago

Ralith commented 3 years ago

Resampling in stream::Receiver and SamplesSource is currently hardcoded to use linear interpolation. Higher quality might be obtained by using higher-order polynomial interpolation, at a latency and CPU time cost. It's unclear if the quality difference would be meaningful.
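For reference, linear-interpolation resampling can be sketched as below. This is a minimal illustration and not oddio's actual code: `resample_linear` is a hypothetical helper that works on a whole buffer, whereas a real mixer would resample incrementally.

```rust
/// Resample `input` from `from_rate` to `to_rate` using linear interpolation.
/// Hypothetical helper for illustration; assumes both rates are nonzero.
fn resample_linear(input: &[f32], from_rate: u32, to_rate: u32) -> Vec<f32> {
    if input.len() < 2 {
        return input.to_vec();
    }
    // Source samples advanced per output sample, e.g. 44100 / 48000 = 0.91875.
    let step = f64::from(from_rate) / f64::from(to_rate);
    let last = input.len() - 1;
    // Number of output samples whose source position stays inside the input.
    let n_out = (last as f64 / step).floor() as usize + 1;
    (0..n_out)
        .map(|n| {
            // Computing the position from `n` avoids accumulating rounding error.
            let pos = n as f64 * step;
            let i = (pos as usize).min(last);
            let t = (pos - i as f64) as f32;
            let next = input[(i + 1).min(last)];
            // Weighted average of the two neighbouring samples.
            input[i] + (next - input[i]) * t
        })
        .collect()
}
```

Only two input samples contribute to each output sample, which is why this is cheap and adds almost no latency. A higher-order interpolator would read a wider window around the same fractional position.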

hasenbanck commented 2 years ago

I ran an extended test of higher-order polynomial interpolation, trying several of the polynomials proposed by Olli Niemitalo in "Polynomial Interpolators for High-Quality Resampling of Oversampled Audio".

In general, polynomial interpolation can't compete with polyphase-based solutions like SOX, but we can do better than the current linear interpolation.

The polynomials I experimented with always involved tradeoffs between compute complexity, distortion, and attenuation of higher frequencies. For example, 4th-order B-splines had very good intermodulation distortion but attenuated higher frequencies very aggressively.

A replacement or alternative for the linear interpolation should be no worse in either distortion or attenuation of higher frequencies. Only higher compute complexity should be acceptable.

I don't have access to the Audio Precision software suite (the "standard"), so I used RightMark Audio Analyzer to compare the resampling quality when resampling a 44.1 kHz test sound to 48 kHz. I included SOX in the test as the "gold standard" to beat.

rma

"Linear" is the current linear interpolation. "Lagrange 4p" is a 4-point, 3rd-order polynomial interpolation. "Lagrange 6p" is a 6-point, 5th-order polynomial interpolation (a Lagrange polynomial through n points has degree n - 1). "sox vhq" is SOX used with the very high quality preset.

We can clearly see that Lagrange 4p and 6p would be an upgrade to the linear filter.
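As a rough illustration of what a 4-point, 3rd-order Lagrange interpolator computes, here is a minimal sketch. The coefficients are those of the unique cubic through four equally spaced samples, evaluated in Horner form. The function name `lagrange_4p` is hypothetical, and this is not necessarily how the tested code is written.

```rust
/// 4-point, 3rd-order Lagrange interpolation (illustrative sketch).
/// `y` holds the samples at positions -1, 0, 1 and 2; `x` in [0, 1] is the
/// fractional position between `y[1]` (position 0) and `y[2]` (position 1).
fn lagrange_4p(y: [f32; 4], x: f32) -> f32 {
    let [ym1, y0, y1, y2] = y;
    // Coefficients of the cubic c3*x^3 + c2*x^2 + c1*x + c0 that passes
    // through all four samples.
    let c0 = y0;
    let c1 = y1 - ym1 / 3.0 - y0 / 2.0 - y2 / 6.0;
    let c2 = (ym1 + y1) / 2.0 - y0;
    let c3 = (y2 - ym1) / 6.0 + (y0 - y1) / 2.0;
    // Horner evaluation: three multiplies and three adds.
    ((c3 * x + c2) * x + c1) * x + c0
}
```

Compared with linear interpolation, each output sample needs four input samples instead of two, so the resampler has to look one additional sample ahead and do more arithmetic per sample.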

For example, the THD:

Linear

thd-linear

Lagrange 4p

thd-lagrange-4p

Lagrange 6p

thd-lagrange-6p

Sox

thd-sox

The frequency response is better than linear, but we still attenuate before reaching 20 kHz. There is also no low-pass filter:

frequency-response

The IMD sweep test is also better, but still not great. The curves look very similar for all polynomial interpolators, so this may be an artifact of geometric interpolation.

imd-sweep

I did a quick benchmark comparing the runtime of oddio::run() while converting the one minute 44.1 kHz test file to the 48 kHz target:

Linear:
- Release mode: 49034 us (49 ms)
- Debug mode: 1396284 us (1396 ms)

Lagrange 4 point:
- Release mode: 75987 us (75 ms), +53% when compared to linear
- Debug mode: 1954187 us (1954 ms)

Lagrange 6 point:
- Release mode: 90049 us (90 ms), +83% when compared to linear
- Debug mode: 2528852 us (2528 ms)

I created a branch with the test code and the Lagrange 6 point interpolation: https://github.com/hasenbanck/oddio/tree/lagrange

Ralith commented 2 years ago

Thanks for this investigation! Sounds like the perf impact is significant, but not exactly debilitating considering the low absolute cost. I think there's a good case for moving ahead with a pluggable interpolator, though I'm not immediately certain what the trait should look like. If only associated constants worked in array lengths...
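One possible shape for such a trait is sketched below, assuming the window size is a const generic parameter on the trait rather than an associated constant, so that it can be used as an array length on stable Rust. The names `Interpolator`, `Linear`, `Lagrange` and `sample_at` are hypothetical and not part of oddio.

```rust
/// Hypothetical trait: an interpolator that reads `N` neighbouring samples.
trait Interpolator<const N: usize> {
    /// `window` holds `N` consecutive samples; `t` in [0, 1) is the fractional
    /// position between the central pair `window[N/2 - 1]` and `window[N/2]`.
    fn interpolate(&self, window: &[f32; N], t: f32) -> f32;
}

struct Linear;

impl Interpolator<2> for Linear {
    fn interpolate(&self, w: &[f32; 2], t: f32) -> f32 {
        w[0] + (w[1] - w[0]) * t
    }
}

/// Lagrange interpolation through `N` equally spaced points (assumes even `N`).
struct Lagrange;

impl<const N: usize> Interpolator<N> for Lagrange {
    fn interpolate(&self, window: &[f32; N], t: f32) -> f32 {
        // window[i] sits at position i - offset, so the central pair is at 0 and 1.
        let offset = (N / 2) as f32 - 1.0;
        let mut sum = 0.0;
        for i in 0..N {
            let xi = i as f32 - offset;
            // Lagrange basis polynomial for node i, evaluated at t.
            let mut basis = 1.0;
            for j in 0..N {
                if j != i {
                    let xj = j as f32 - offset;
                    basis *= (t - xj) / (xi - xj);
                }
            }
            sum += window[i] * basis;
        }
        sum
    }
}

/// Evaluate `samples` at fractional index `pos`, clamping the window at the edges.
fn sample_at<const N: usize, I: Interpolator<N>>(interp: &I, samples: &[f32], pos: f32) -> f32 {
    assert!(!samples.is_empty());
    let base = pos.floor();
    let t = pos - base;
    // Index of the first sample in the window.
    let first = base as isize - (N as isize / 2 - 1);
    let last = samples.len() as isize - 1;
    let mut window = [0.0f32; N];
    for (k, slot) in window.iter_mut().enumerate() {
        *slot = samples[(first + k as isize).clamp(0, last) as usize];
    }
    interp.interpolate(&window, t)
}
```

A call such as `sample_at::<6, _>(&Lagrange, &samples, 2.5)` selects the 6-point variant at compile time. The basis-product form is easy to verify but slower than precomputed coefficients, so a real implementation would likely specialise each window size.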

> Lagrange 4p

How does this compare to cubic (or is it a synonym?)

hasenbanck commented 2 years ago

> Lagrange 4p
>
> How does this compare to cubic (or is it a synonym?)

As far as I understand, "cubic" interpolation refers to any interpolation that uses a third-degree polynomial. Olli Niemitalo says in the paper that linear and cubic interpolators are all piecewise polynomial interpolators. He then compares different polynomial interpolators, and Lagrange is one of the polynomials he uses. 4p uses four points and is a third-degree polynomial. 6p uses six points and is a fifth-degree polynomial. The more common cubic interpolators use the cubic Hermite interpolator and/or its special case, the Catmull–Rom spline.
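To make the distinction concrete, here is a minimal sketch of the Catmull–Rom case. It also reads four points and evaluates a cubic, but it picks the cubic that passes through the two central samples with slopes estimated from their neighbours, instead of the cubic through all four samples. The function name `catmull_rom` is hypothetical.

```rust
/// 4-point, 3rd-order Catmull-Rom (cubic Hermite) interpolation, as a sketch.
/// `y` holds the samples at positions -1, 0, 1 and 2; `x` in [0, 1] is the
/// fractional position between `y[1]` (position 0) and `y[2]` (position 1).
fn catmull_rom(y: [f32; 4], x: f32) -> f32 {
    let [ym1, y0, y1, y2] = y;
    let c0 = y0;
    // Slope at position 0, estimated as the central difference (y1 - ym1) / 2.
    let c1 = (y1 - ym1) / 2.0;
    let c2 = ym1 - 2.5 * y0 + 2.0 * y1 - 0.5 * y2;
    let c3 = (y2 - ym1) / 2.0 + 1.5 * (y0 - y1);
    // Horner evaluation of c3*x^3 + c2*x^2 + c1*x + c0.
    ((c3 * x + c2) * x + c1) * x + c0
}
```

For the samples -1, 0, 1, 8 (that is, t^3 at t = -1, 0, 1, 2), the cubic through all four points gives 0.015625 at x = 0.25, which is what a 4-point Lagrange interpolator returns, while this Catmull–Rom sketch gives 0.109375. Both are "cubic", but they are different interpolators.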

It has to be said that the paper analyses the use of those interpolators on oversampled data (using a FIR filter).