projectNe10 / Ne10

An open optimized software library project for the ARM® Architecture

fix: fft missing break #273

Open mdionisio opened 3 years ago

mdionisio commented 3 years ago

I'm not sure about the fix, and I can't run the tests right now because I'm on an Intel machine. But just from reading the code, it seems that there is a missing break.

peter-toft-greve commented 2 years ago

Yeah - looks like that

peter-toft-greve commented 2 years ago

Anyone reading this other than @mdionisio ?

mdionisio commented 2 years ago

I can only say that:

If 'ne10_radix_8_butterfly_float32_c' is implemented correctly, my fix is correct.

The previous code did not produce wrong results, because the generic function is still called afterwards, but it has a performance bug: the output is computed twice.

So with my fix the _c version of the FFT performs better.

I'm not able to run the tests on ARM with NEON right now. But in theory, if the tests continue to run correctly, everything is fine, because the NEON version is unchanged.
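
To illustrate the kind of bug described above, here is a minimal sketch in C. It is not the Ne10 source: the function names, the counters, and the placeholder arithmetic are all hypothetical stand-ins, and it only assumes that the specialized radix-8 path and the generic path produce the same output. It shows how a missing break makes a switch fall through, so the result stays correct but the work is done twice.

```c
#include <stddef.h>

/* Hypothetical call counters, used only to observe which paths run. */
static int specialized_calls = 0;
static int generic_calls = 0;

/* Hypothetical stand-in for a specialized radix-8 kernel. */
static void radix8_specialized(float *out, const float *in, size_t n)
{
    specialized_calls++;
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * 2.0f; /* placeholder for the real butterfly math */
}

/* Hypothetical stand-in for the generic kernel; same output, assumed slower. */
static void generic_butterfly(float *out, const float *in, size_t n)
{
    generic_calls++;
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * 2.0f; /* placeholder for the real butterfly math */
}

/* Buggy dispatch: case 8 has no break, so the generic kernel also runs. */
static void dispatch_missing_break(int radix, float *out, const float *in, size_t n)
{
    switch (radix)
    {
    case 8:
        radix8_specialized(out, in, n);
        /* fall through */
    default:
        generic_butterfly(out, in, n); /* overwrites out with the same values */
        break;
    }
}

/* Fixed dispatch: the break stops after the specialized kernel. */
static void dispatch_with_break(int radix, float *out, const float *in, size_t n)
{
    switch (radix)
    {
    case 8:
        radix8_specialized(out, in, n);
        break;
    default:
        generic_butterfly(out, in, n);
        break;
    }
}
```

Calling both dispatch functions with radix 8 gives identical output buffers, but the version without the break increments both counters, while the fixed version increments only the specialized one. That matches the claim that the fix changes performance, not results, as long as the specialized kernel is itself correct.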