espressif / esp-dsp

DSP library for ESP-IDF
Apache License 2.0
442 stars 87 forks source link

dsps_fft2r_sc16_aes3 different result from dsps_fft2r_sc16_ansi, incorrect FFT (DSP-102) #67

Closed ayavilevich closed 11 months ago

ayavilevich commented 1 year ago

Environment

Problem Description

Running dsps_fft2r_sc16_aes3 and dsps_fft2r_sc16_ansi gives results that are similar but different. The difference is especially large in the start of the result buffer. This issue makes S3 acceleration problematic to use and rely on. This is supposed to be a key feature of the S3.

Expected Behavior

Expecting optimized and reference implementation to return the same result.

Actual Behavior

Difference as high as 10 (s16) between the results of the various functions.

W (1108) DSP_TEST: Difference between signals x1 and x2 on one plot
I (1118) view: Data min[63] = 0.000000, Data max[1] = 0.000305
 ________________________________________________________________
0-                                                               |
1-                                                               |
2--                                                              |
3----                                                            |
4-------                                                         |
5-------                                                         |
6-------- --- -     -         -                          ---     |
7 - ----------------- ----------- - -     - -- -- --   - ------- |
8   ------- ----------- -----------------------------------------|
9                                                                |
 0123456789012345678901234567890123456789012345678901234567890123
I (1188) view: Plot: Length=1024, min=0.000000, max=0.000305
I (1198) DSP_TEST: view_vector
I (1198) DSP_TEST: view_vector, s=1024, min=0, max=10
I (1208) DSP_TEST: 0: 8
I (1208) DSP_TEST: 1: 10
I (1218) DSP_TEST: 2: 10
I (1218) DSP_TEST: 3: 6
I (1218) DSP_TEST: 4: 8
I (1228) DSP_TEST: 5: 4
I (1228) DSP_TEST: 6: 7

Difference is larger in the first quarter.

Steps to reproduce

Run FFT with both functions on the same input and compare the result.

Code to reproduce this issue

https://gist.github.com/ayavilevich/8f44c89620612284ebe3f055eba13c6d

Debug Logs

console log.txt

Other items if possible

sdkconfig.zip esp-dsp-test.zip

dmitry1945 commented 1 year ago

@ayavilevich thank you,

I will look.

Regards, Dmitry

dmitry1945 commented 1 year ago

Hi @ayavilevich

I have look to your example. Thank you very much. You are first who ask this question about FFT.

The error that you see here is not a real error. I will explain why.

The version for the esp32s3 is highly optimyzed for the speed. That's why the rounding that we use for ansi and for esp32 versions not used here. It slightly reduce the SFDR at low frequencies, but anyway, it's good for most of the applications.

Anyway, it's a good point to include identical fixed point FFT for the esp32s3 to the esp-dsp. I will organize it.

Thank you very much!

ayavilevich commented 1 year ago

@dmitry1945 thanks for looking into it.

I understand that this is due to an arithmetic error in the optimized calculation and not a bug in the implementation.

I wonder if there is way to work around this because the difference really messes up the DC for many input. I think only inputs that use the full dynamic range are immune to it. Is it possible to create an alternative optimized version that will add the missing rounding? It could be a more accurate version that runs slower? Is this a fixed addition that I can subtract from the result?

dmitry1945 commented 1 year ago

@ayavilevich yes, we will add another slower version. Point that, I can't say when we will do this.

About dinamic range. In your example you have max difference - 10. It's 20 dB. But, the ansi FFT already loose some dBs, and finally this algorithms (for esp32s3) will be ~ 10-12 dB worser then ansi one. The 10 dB is only 2 bits, and it's not bad. Most of another fixed points FFTs for another platforms loose much more.

ayavilevich commented 1 year ago

@dmitry1945 Hi,

Any update on this? Perhaps you can publish (here?) an updated assembly function for dsps_fft2r_sc16_aes3 that will add the rounding that we normally have in the ansi and esp32 versions?

dmitry1945 commented 11 months ago

Hi @ayavilevich ,

The problem is if we will use the rounding it will decrease the performance of the FFT. If you need the same SNR as ansi version, you can use esp32 version of FFT, it will work on esp32s3 as well. It will be slower, but will have better SNR.

Regards, Dmitry

ayavilevich commented 11 months ago

@dmitry1945 yes, I understand. I was hoping that using the rounding with an sc16 FFT will still be faster than the "regular" ESP32 FFT. In that case there is value in having such an option as well.