kosme / arduinoFFT

Fast Fourier Transform for Arduino
GNU General Public License v3.0

too much time cost for compute FFT on Esp32-C3 #77

Open blacknull opened 1 year ago

blacknull commented 1 year ago

Here's my FFT code, with 256 samples and an 8000 Hz sampling frequency:

```
unsigned long timeLoop = 0;

void loop() {
  mbegin = micros();

  // Compute FFT
  FFT.DCRemoval();
  FFT.Windowing(FFT_WIN_TYP_HAMMING, FFT_FORWARD);
  FFT.Compute(FFT_FORWARD);
  FFT.ComplexToMagnitude();

  timeLoop = (micros() - mbegin + timeLoop) / 2; // moving average value
  if (countLoop % 100 == 0) {
    USB_SERIAL.println("runFFT cost: " + String(timeLoop) + " micro seconds");
  }
}
```
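A side note on the averaging line in the snippet: `(sample + previous) / 2` weights the newest measurement by one half, so the printed value follows the latest loop closely and settles within a few dozen iterations. The sketch below reproduces that update rule in portable C++ so it can be tried off-device. The helper names `updateAverage` and `averageAfter` are made up for this illustration and are not part of the posted sketch or of arduinoFFT.

```cpp
#include <cstdint>

// Same update rule as the posted loop: new average = (sample + previous) / 2.
// Integer division truncates, as it does with unsigned long on the MCU.
uint32_t updateAverage(uint32_t previous, uint32_t sample) {
    return (sample + previous) / 2;
}

// Feed the same sample `loops` times, starting from an average of 0
// (the initial value of timeLoop in the posted code).
uint32_t averageAfter(uint32_t sample, int loops) {
    uint32_t average = 0;
    for (int i = 0; i < loops; ++i) {
        average = updateAverage(average, sample);
    }
    return average;
}
```

With a steady sample the average halves its distance to the true value on every loop, and truncation can leave it one count below that value. For timings in the 100 ms range this error is negligible.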

It works fine on the ESP8266 (160 MHz), where it takes 20 ms, and on the ESP32, where it takes 10 ms. But on the ESP32-C3:

runFFT cost: 115538 micro seconds
runFFT cost: 115510 micro seconds
runFFT cost: 115474 micro seconds
runFFT cost: 115286 micro seconds
runFFT cost: 114535 micro seconds

That's unacceptable... The code is the same and I don't know what's wrong with it. Can anybody help? Thanks.

kosme commented 1 year ago

I have noticed that the library behaves erratically on ESP32 boards when running at max frequency. The workaround I found is reducing the clock frequency to the second-highest one.

blacknull commented 1 year ago

Thanks for your reply, and sorry for not making myself clear. The code works fine on the ESP8266 and ESP32, but on the ESP32-C3, which has a 160 MHz RISC-V CPU, it is 10 times slower than on the ESP32. I switched the CPU from 160 MHz to 120 MHz and 80 MHz; that made things worse, not better. FFT.Compute(FFT_FORWARD) takes most of the time. Looking at the code in FFT.Compute(), only sqrt() seemed to be heavy, so I timed 100 calls to sqrt(). They took only 615 us, so sqrt() is not the culprit. What else could be going wrong? Thanks.
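One way to look at the question above is to count the arithmetic instead of the sqrt() calls. In a textbook radix-2 FFT the bulk of the work is butterfly operations, each made of plain multiplications and additions. The sketch below only counts those operations for a given size. It assumes the standard butterfly form (one complex multiply plus a complex add and subtract) and is not taken from the library's source; `FftCost` and `radix2Cost` are names invented here.

```cpp
#include <cstdint>

// Operation counts for a textbook radix-2 FFT of n points (n a power of two).
struct FftCost {
    uint32_t stages;          // log2(n)
    uint32_t butterflies;     // (n / 2) * log2(n)
    uint32_t realMultiplies;  // 4 per butterfly (one complex multiply)
    uint32_t realAddSubs;     // 2 from the complex multiply + 4 from add/subtract
};

FftCost radix2Cost(uint32_t n) {
    uint32_t stages = 0;
    for (uint32_t m = n; m > 1; m >>= 1) {
        ++stages;  // count how many times n can be halved
    }
    const uint32_t butterflies = (n / 2) * stages;
    return {stages, butterflies, 4 * butterflies, 6 * butterflies};
}
```

For 256 samples this gives 1024 butterflies, or roughly ten thousand real multiplications and additions. If each of those costs even a few microseconds, which is the range of the 6.15 us per sqrt() measured above, the total reaches tens of milliseconds without sqrt() being the hot spot.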

HorstBaerbel commented 1 year ago

Try the develop branch.

kosme commented 1 year ago

@blacknull Is the issue still present on the newer versions?

kosme commented 10 months ago

Since you mention that other esp32 boards work correctly, I would think that this is likely a hardware/esp core problem. As I asked previously, is the issue still present?

softhack007 commented 10 months ago

> on esp32-c3 which has a risc-v 160Mhz cpu, is 10 times slow than esp32.

@blacknull I think what you see is normal behaviour, as the -C3 is very slow, especially when doing floating-point math.

We are using ArduinoFFT inside WLED audioreactive; our performance comparisons point in the same direction as what you observed.

Our explanation is: the ESP32 (the "classic" one) and the ESP32-S3 both have an FPU (hardware floating-point acceleration), while the ESP32-S2 and ESP32-C3 lack an FPU and have to use software emulation, which is slow.

The performance drop between esp32 -> esp32-S2 is about 6x, and another 2x drop when using ESP32-C3.

Btw, we are using the develop branch which allows us to perform everything in float instead of double -> 8x faster!
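To illustrate what "everything in float instead of double" means in code, here is a minimal sketch of an FFT whose floating-point type is a template parameter. It is a self-contained stand-in written for this example, not the develop branch's implementation or API; `fftRadix2` and `peakBin` are hypothetical names.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// In-place iterative radix-2 FFT. T is float or double.
// re and im must have the same power-of-two length.
template <typename T>
void fftRadix2(std::vector<T>& re, std::vector<T>& im) {
    const std::size_t n = re.size();
    // Bit-reversal permutation.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) {
            std::swap(re[i], re[j]);
            std::swap(im[i], im[j]);
        }
    }
    const T pi = static_cast<T>(3.14159265358979323846);
    for (std::size_t len = 2; len <= n; len <<= 1) {
        // Twiddle step for this stage: exp(-2*pi*i / len).
        const T angle = -2 * pi / static_cast<T>(len);
        const T wRe = std::cos(angle), wIm = std::sin(angle);
        for (std::size_t start = 0; start < n; start += len) {
            T uRe = 1, uIm = 0;
            for (std::size_t k = 0; k < len / 2; ++k) {
                const std::size_t a = start + k, b = a + len / 2;
                // Butterfly: all arithmetic happens in type T.
                const T tRe = uRe * re[b] - uIm * im[b];
                const T tIm = uRe * im[b] + uIm * re[b];
                re[b] = re[a] - tRe;
                im[b] = im[a] - tIm;
                re[a] += tRe;
                im[a] += tIm;
                const T nextRe = uRe * wRe - uIm * wIm;
                uIm = uRe * wIm + uIm * wRe;
                uRe = nextRe;
            }
        }
    }
}

// Transform a sine with `cycles` full periods over n samples and
// return the index of the strongest bin in 1 .. n/2 - 1.
template <typename T>
std::size_t peakBin(std::size_t n, std::size_t cycles) {
    const T pi = static_cast<T>(3.14159265358979323846);
    std::vector<T> re(n), im(n, T(0));
    for (std::size_t i = 0; i < n; ++i) {
        re[i] = std::sin(2 * pi * static_cast<T>(cycles * i) / static_cast<T>(n));
    }
    fftRadix2(re, im);
    std::size_t best = 1;
    T bestPower = 0;
    for (std::size_t k = 1; k < n / 2; ++k) {
        const T power = re[k] * re[k] + im[k] * im[k];
        if (power > bestPower) {
            bestPower = power;
            best = k;
        }
    }
    return best;
}
```

Calling `peakBin<float>(256, 20)` and `peakBin<double>(256, 20)` exercises the same code in both precisions. For locating spectral peaks, single precision is typically sufficient, and only the `float` version can use a single-precision FPU where one exists.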

For a forward FFT with 512 samples, we typically see these execution times: