chipaudette / OpenAudio

Software for Processing Audio

FIR_FFT_Benchmarking.ino KISS FFT memory failure #1

Open manitou48 opened 7 years ago

manitou48 commented 7 years ago

Not a big deal, since the first iteration of the KISS_FFT benchmark works, but the 2nd and 3rd iterations report "KISS FFT: N = 1024 in * NOT ENOUGH TEMP MEMORY *". Is there a memory leak?

FYI, I ran the benchmark on Teensy 3.2/3.5/3.6 and confirmed your results. I also ran it on Dragonfly (STM32L4 @ 80 MHz) and mbed K64F @ 120 MHz. Speedup relative to T3.2 @ 96 MHz for float 128: Dragonfly 7.7, mbed K64 12.97

The mbed K64 is faster than the Teensy 3.5, probably because mbed uses ARM gcc with -O3.

chipaudette commented 7 years ago

Hey, thanks for giving it a try! Fantastic!

Regarding the "Not Enough Temp Memory", yes, I saw that behavior as well. The KISS routine was originally written for desktop machines and was filled with malloc() calls. I think that I removed most of those, but I'm not sure that I got them all.

Also, the algorithm uses recursion, which causes the stack to grow as local arrays are allocated with each call. While they should be correctly released as the recursion is unwound, I fear that there is memory fragmentation happening that may prevent the biggest arrays from getting allocated the second time through.
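For illustration only, here is a minimal sketch of how a leftover allocation produces exactly this symptom (the first pass works, later passes fail) and how a fixed scratch pool that is reset between iterations avoids it. This is not the KISS FFT code or the benchmark's code: `scratch_pool`, `scratch_alloc`, `scratch_reset`, `run_once`, `repeat_runs`, and the 12288-byte pool size are all hypothetical names and values chosen for the example, and `run_once` merely stands in for one FFT call that needs an N-point complex float work buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pool size, chosen so that one 1024-point complex float
 * buffer (8192 bytes with 4-byte floats) fits but two do not. */
#define SCRATCH_BYTES 12288u

/* The union forces alignment suitable for float/double buffers. */
static union { uint8_t bytes[SCRATCH_BYTES]; double align; } scratch_pool;
static size_t scratch_used = 0;  /* bytes currently handed out */
static size_t scratch_peak = 0;  /* high-water mark, handy for sizing the pool */

/* Bump-allocate nbytes from the pool; returns NULL when it does not fit. */
static void *scratch_alloc(size_t nbytes)
{
    size_t rounded = (nbytes + 7u) & ~(size_t)7u;  /* round up to 8 bytes */
    assert(scratch_used <= SCRATCH_BYTES);
    if (rounded < nbytes || rounded > SCRATCH_BYTES - scratch_used) {
        return NULL;                               /* "not enough temp memory" */
    }
    void *p = &scratch_pool.bytes[scratch_used];
    scratch_used += rounded;
    if (scratch_used > scratch_peak) {
        scratch_peak = scratch_used;
    }
    return p;
}

/* Release everything at once; nothing can leak or fragment across runs. */
static void scratch_reset(void)
{
    scratch_used = 0;
}

/* Stand-in for one FFT run: it only requests the work buffer. */
static int run_once(size_t n)
{
    float *work = (float *)scratch_alloc(2u * n * sizeof(float));
    return work != NULL;
}

/* Run several iterations and count how many obtained their buffer.
 * With reset_between == 0 the earlier buffers are never released,
 * which mimics a leak. */
static int repeat_runs(size_t n, int iterations, int reset_between)
{
    int ok = 0;
    scratch_reset();
    for (int i = 0; i < iterations; ++i) {
        if (run_once(n)) {
            ++ok;
        }
        if (reset_between) {
            scratch_reset();
        }
    }
    return ok;
}
```

With the pool reset after every iteration, each pass starts from the same state, so a failure on a later pass would point at something other than leftover allocations. The `scratch_peak` counter gives a rough idea of how large the pool needs to be for a given N.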

Since I'm not going to use the KISS routines for my projects (I'll use the CMSIS library), I wasn't too concerned about fixing it.

Thanks for the additional results on the other platforms! I'm hoping to do a trial with the STM32 (F4?) 180 MHz part when the Arduino-STM collaboration is released.

chipaudette commented 7 years ago

Oh, since the CMSIS FFT is so good, do you think that there's any value in trying to fix the KISS memory issue? If not, I might close the issue as one of those "Not Going to Fix" issues...

manitou48 commented 7 years ago

Well, if it's an easy fix for KISS, maybe yes ... otherwise no.

Here are ARMf/float numbers for mbed Nucleo-F446RE (STM32F446RET6 @ 180 MHz):
ARM FFT: N = 1024 in 619.5 usec per operation
ARM FFT: N = 512 in 314.6 usec per operation
ARM FFT: N = 256 in 133.4 usec per operation
ARM FFT: N = 128 in 68.4 usec per operation
ARM FFT: N = 64 in 28.3 usec per operation
ARM FFT: N = 32 in 15.1 usec per operation
ARM FFT: N = 16 in 5.8 usec per operation
ARM FFT: N = 8 in 3.2 usec per operation