[BUG] sample distortion by overflow/saturation issue of iir_df2t()

johnylin76 commented 3 years ago

In current iir_df2t() implementation, it may produce sample overflow on the calculation of delay and forced saturation (to Q1.31) of output value in each biquad filter stage, which leads to audio distortion in some cases.

For example, the emphasis and deemphasis EQ IIR filter of our use case in the path of multiband DRC. The coefficients of those two-stage biquad EQ IIR filter are as follows:

          a2        a1        b2        b1        b0     shift      gain
emphasis =
{
  biquad[1] =
    -0.02292   0.36471   0.10053  -0.68118   1.00000   0.00000   1.56959
  biquad[2] =
    -0.38907   1.26299   0.56309  -1.50766   1.00000  -1.00000   1.13747
}
deemphasis =
{
  biquad[1] =
    -0.10053   0.68118   0.02292  -0.36471   1.00000   0.00000   0.63711
  biquad[2] =
    -0.56309   1.50766   0.38907  -1.26299   1.00000   0.00000   0.43957
}

https://github.com/thesofproject/sof/blob/master/src/math/iir_df2t_generic.c By the usage of current iir_df2t() would lead to overflow in tmp(L71), delayd, delayd+1, and the incorrect saturation from the first biquad to the second one(L93).

During my verification the above issues have caused notable audio distortion. The following figure is an example of distortion in deemphasis stage.

Screenshot from 2020-12-04 16-13-16 top: input stream, middle: distorted output generated by iir_df2t(), bottom: expected output generated by floating-point calculation

johnylin76 commented 3 years ago

Hi @singalsu , could you help to checkon this issue?

singalsu commented 3 years ago

@johnylin76 Thanks for report! I'll check it.

Meanwhile, have you checked that each of the biquad is scaled with shifts as done in example_iir_eq.m via function eq_iir_blob_quant().

cujomalainey commented 3 years ago

@singalsu the issue appears to be reflected in the comments for making the temp var since you are "Shift Q3.61 to Q3.31" 3 + 31 = 34 > 32 bits which is an overflow of its container. And down below you state tmp as Q1.31 without any changes.

johnylin76 commented 3 years ago

@singalsu the issue appears to be reflected in the comments for making the temp var since you are "Shift Q3.61 to Q3.31" 3 + 31 = 34 > 32 bits which is an overflow of its container. And down below you state tmp as Q1.31 without any changes.

Yeah, what @cujomalainey mentioned is the main problem of iir_df2t() implementation. "tmp" would overflow on casting to int32_t (even with sat_int32 it goes wrong). And we shouldn't saturate output in the middle of biquad stages

singalsu commented 3 years ago

Yep, the Q3.31 is saturated to Q1.31. The scaling of biquads via shifts when creating the blob should prevent this from happening. You can prevent this by downscaling more (one bit?) the problematic biquad and upscaling with last shift the last biquad output.

I'll try to replicate this today. I was under impression the scaling procedure is too pessimistic but with this example it might not be the case. I'd prefer to solve this with down-scaled blob instead of adding complexity to filter code (more cycles to filter core). The saturation is the last measure to prevent fatal overflows.

Edit: Sorry I'm wrong, there's no saturation there in generic C, the HiFi3 version saturates though.

singalsu commented 3 years ago

@johnylin76 Is this decoding of above filters coefficients correct?

function test

%                   a2        a1        b2        b1        b0        shift     gain
emp_biquad1 = [-0.02292   0.36471   0.10053  -0.68118   1.00000   0.00000   1.56959];
emp_biquad2 = [-0.38907   1.26299   0.56309  -1.50766   1.00000  -1.00000   1.13747];
dee_biquad1 = [-0.10053   0.68118   0.02292  -0.36471   1.00000   0.00000   0.63711];
dee_biquad2 = [-0.56309   1.50766   0.38907  -1.26299   1.00000   0.00000   0.43957];

[z0, p0, k0] = get_zpk(emp_biquad1);
[z1, p1, k1] = get_zpk(emp_biquad2);
emp_z = [z0 ; z1];
emp_p = [p0 ; p1 ];
emp_k = k0 * k1;

[z0, p0, k0] = get_zpk(dee_biquad1);
[z1, p1, k1] = get_zpk(dee_biquad2);
dee_z = [z0 ; z1];
dee_p = [p0 ; p1 ];
dee_k = k0 * k1;

alsa_fn = 'eq_iir_emphasis.txt';
coefq = eq_iir_blob_quant(emp_z, emp_p, emp_k);
channels_in_config = 2;
assign_response = [0 0];
num_responses = 1;
coefm = eq_iir_blob_merge(channels_in_config, ...
               num_responses, ...
               assign_response, ...
               coefq);
coefp = eq_iir_blob_pack(coefm);
eq_alsactl_write(alsa_fn, coefp);
eq_blob_plot(alsa_fn);

alsa_fn = 'eq_iir_deemphasis.txt';
coefq = eq_iir_blob_quant(dee_z, dee_p, dee_k);
channels_in_config = 2;
assign_response = [0 0];
num_responses = 1;
coefm = eq_iir_blob_merge(channels_in_config, ...
               num_responses, ...
               assign_response, ...
               coefq);
coefp = eq_iir_blob_pack(coefm);
eq_alsactl_write(alsa_fn, coefp);
eq_blob_plot(alsa_fn);

end

function [z, p, k] = get_zpk(biquad)
    a = ones(1,3);
    a(2:3) = -biquad(2:-1:1);
    b = biquad(5:-1:3);
    b = b * 2^(biquad(6)) * biquad(7);
    [z, p, k] = tf2zp(b, a);
end

Edit: If I change from above "b = b 2^(-biquad(6)) biquad(7);" there would be near +15 dB boost in emphasis and same near -15 dB compensation in de-emphasis. I wonder if this correct?

singalsu commented 3 years ago

@singalsu the issue appears to be reflected in the comments for making the temp var since you are "Shift Q3.61 to Q3.31" 3 + 31 = 34 > 32 bits which is an overflow of its container. And down below you state tmp as Q1.31 without any changes.

Yep, I agree there's mistake in not saturating tmp, just cast (int32_t) is not enough. I will add saturation there like the DRC biquad version has. But when looking the scalings for DRC I don't think it's generic enough. The code needs further robustness with 64 bit saturations added for the recursive side and it increases computational complexity.

I'm still convinced each biquad need to ensure scaling that keeps the internal sum before gain and shift operations less than 1.0 absolute value. It's because the input is Q1.31 and output Q1.31. The gain should be used to boost the amplitude and not decrease (see above de-emphasis). In DRC a Q6 format is used for intermediate. It's OK design choice but not generic.

johnylin76 commented 3 years ago

Thanks Seppo for the great information for the usage of iir_df2t function. I have done some better scaling and gain adjustment to my emphasis and deemphasis filter coefficients. Details are mentioned in https://github.com/thesofproject/sof/pull/3657

I also agree your PR about fixing the overflow biquad output. Thanks

thesofproject / sof

[BUG] sample distortion by overflow/saturation issue of iir_df2t() #3678