Questions regarding FFT Api (DSP-95)

Galfy1 commented 1 year ago

First of all, thanks for providing this API. I want to use the FFT API on an ESP32-S3, but there are some things that I struggle to understand.

I will be referring to pages from this article: https://www.sjsu.edu/people/burford.furman/docs/me120/FFT_tutorial_NI.pdf

Here is a list of my questions:

1. **Why are dsps_cplx2reC and cplx2real used?** What do they do, and what is the difference between them? There is conflicting documentation about this. For example, a comment in the FFT examples states that dsps_cplx2 returns two complex vectors, but the ESP-DSP Library documentation states that it returns two real arrays. Why can’t I just take the magnitude of the complex result directly after doing the FFT and bit reversal?

2. **How to use the FFT with real input values?** I get that I can set all the imaginary parts to 0, but this is not what is done in the fft4real example. There, all the values are simply placed directly after each other without “padding 0’s” in the imaginary parts (see the code snippet below). The fft4real example also uses dsps_cplx2real instead of dsps_cplx2reC (referring back to question 1).
```
// (Snippet from fft4real example):

// Convert two input vectors to one complex vector
for (int i=0 ; i< N ; i++)
{
    x1[i] = x1[i] * wind[i];
    x2[i] = x1[i];
}
```



3.  **Is the output of the FFT single or two-sided?** At the bottom of page 2 in the article I linked above, it is stated that all values located between 1 and N/2-1 on the frequency axis should be doubled to convert the spectrum from two-sided to single-sided, and that all values after N/2 should be discarded. But this is not done in the ESP-DSP FFT examples. I don't understand this.

4. **Is the output of the FFT amplitude or power?** On page 6 in the article I linked above, it is stated that 10 * log10 is used for power and 20 * log10 is used for amplitude when converting from linear scale to log scale. I would think the output of the FFT would be amplitude, but 10 * log10 is used in the ESP-DSP FFT examples, as if the output of the FFT were power. I don't understand this.

I would love to get some clarification on these questions.

best regards,
Galfy1

dmitry1945 commented 1 year ago

Hi @Galfy1, I will answer your questions one by one.

All calculations use the complex FFT. For a complex input signal we have a real part and an imaginary part (re and im are independent values); we run the FFT, then the bit reversal, and end up with a complex spectrum. Analyzing the FFT shows that a single complex FFT can easily be used to compute the FFT of two real signals: place the first signal in the real part of the input and the second signal in the imaginary part, run the complex FFT on this complex signal, and bit reverse. The result is the complex spectrum of the combined complex input, and from it we need to extract the spectrum of the first real signal and the spectrum of the second real signal. That is what dsps_cplx2reC does. For example, say you have a stereo audio signal and you want the FFT of both the right and the left channel. Place right as real and left as imag, then run the FFT, the bit reversal, and dsps_cplx2reC. You end up with the spectrum of the left channel and the spectrum of the right channel.

Second case: we have a single real input signal of length N and need its FFT. We could build a complex signal of length N, copy the input into the real part, set the imaginary part to 0, then run the FFT and the bit reversal, and that's it. But in this case the resulting complex spectrum in 0..N/2 is almost the same as in N/2..N, which means we did some useless calculations. To avoid this, we can run an FFT of length N/2 and add an extra operation, dsps_cplx2real, that finalizes the spectrum of the real input signal. We copy the input signal of length N into the real and imaginary parts of a complex signal of length N/2 and use a complex FFT of length N/2 as follows: FFT N/2 -> bit reverse N/2 -> dsps_cplx2real. The output of this sequence is the complex spectrum of the real input signal of length N.

dmitry1945 commented 1 year ago

A property of real input is that the complex spectrum at the output from N/2..N almost repeats the samples from 0..N/2. That's why we only need the complex samples from 0 to N/2.

To compute the FFT of a real input, in the radix-2 case:

    dsps_fft2r_fc32(x1, N>>1);
    // Bit reverse 
    dsps_bit_rev2r_fc32(x1, N>>1);
    // Convert one complex vector with length N/2 to one real spectrum vector with length N/2
    dsps_cplx2real_fc32(x1, N>>1);

Here, x1 finally holds the complex spectrum of your real input signal.

In the radix-4 case:

    // FFT Radix-4
    unsigned int start_r4 = dsp_get_cpu_cycle_count();
    dsps_fft4r_fc32(x2, N>>1);
    // Bit reverse 
    dsps_bit_rev4r_fc32(x2, N>>1);
    // Convert one complex vector with length N/2 to one real spectrum vector with length N/2
    dsps_cplx2real_fc32(x2, N>>1);

The same applies here: x2 holds the complex spectrum of length N/2 for the real input signal of length N.

dmitry1945 commented 1 year ago

3 and 4. The output of the FFT is a full spectrum with complex values. To get the amplitude, compute `sqrt(re*re + im*im)`. If you use dsps_cplx2reC or dsps_cplx2real_fc32, you will have only a single side.

dmitry1945 commented 1 year ago

regards, Dmitry

Galfy1 commented 1 year ago

I forgot to thank you for your in-depth answer! Thanks!

espressif / esp-dsp

Questions regarding FFT Api (DSP-95) #53