xcore / sc_dsp_transforms

DSP domain transformation components: FFT and DCT
http://github.xcore.com/sc_dsp_transforms/
Other
7 stars 8 forks source link

Fixed point FFT #1

Closed lilltroll77 closed 12 years ago

lilltroll77 commented 12 years ago

The 64 bit MACS is a little bit better compared to a typical 40 bit.

Have you made any calculations on the possible SNR using FFT for FIR filtering overlap and add.

I have never written a low level FFT implementation for an ISA my self like XMOS.

Is it possible to reach >100 dBFS without using extended precision.

henkmuller commented 12 years ago

I could try and write an FFT; using either high or normal precision arithmetic. Would this mean that all intermediate results are 64-bits wide? Or do you just compute one intermediate result at 64 bits and then reduce the intermediate result to 32 bits?

X0..XN are stored in K bits, where K is either 32 or 64. Root0..RootN are stored in 32 bits.

X0' := Root0 * X0 + Root1 * X1 X1' := Root0 * X0 - Root1 * X1

If K is 32, then the * is is 32x32 -> 64, and the + is 64+64 -> 64, the := is 64 -> 32 If K is 64, then the * is is 64x32 -> 96, and the + is 96+96 -> 96, the := is 96 -> 64

The former will just be MACCS instructions - the latter will be more expensive...

henkmuller commented 12 years ago

By the way, how many points in your FFT? 4096? (that is 16K worth of data - about as much as a core can cope with)

henkmuller commented 12 years ago

A first version FFT is in module_fft_simple. There are a number of obvious optimisations that I need to implement next. This version multiplies the FFT by N which needs repairing (but it gives a high precision answer as a result)

henkmuller commented 12 years ago

I am not sure how to compute a fair dBFS figure, but you probably can if I give you the numbers For a 1024 point FFT followed by an inverse FFT with random input in the range 0..0xFFFFFF (ie, 24 bit positive numbers) I get a sum of squared errors over both imaginary and real part of 1425554. This seems like a 5-bit error on a 24-bit signal, or 19 bits precision.

henkmuller commented 12 years ago

The current version is reasonably close to optimal speed (for this algorithm) and has reasonable accuracy. Either can be improved at a cost to the other...

Bottom line: about 1 bit of error for every factor 4 in the number of points, so 1024 points has 5-6 bits of error. 4096 points will have 6-7 bits of error. With a 24 bit signal you will have 16-17 bits of accuracy in your result, which is around 6_17 to 6_17 = -96 to -102 dB.

lilltroll77 commented 12 years ago

I will pull and take a look!