ar1st0crat / NWaves

.NET DSP library with a lot of audio processing functions
MIT License

Stft output frequency bin count doesn't match Stft constructor parameters? #31

Closed Bambofy closed 3 years ago

Bambofy commented 3 years ago

Hi!

I'm currently using the Stft feature. When I provide these parameters:

Stft stft = new Stft(windowSize: 1024, hopSize: 256, window: NWaves.Windows.WindowTypes.Hann, fftSize: 256);

each output spectrum has a length of 513, not 256. Why is that?

See the image below that displays the number of bins in a single spectrum of the Stft spectrogram

[Image: Screenshot 2020-09-14 at 09 24 38]

Additionally:

Q1) What does the Y axis of the resulting graph represent? Is it power?
Q2) The Y values output by the Stft are normalized. How are they normalized? E.g.:

[Image: Screenshot 2020-09-14 at 09 32 29]

ar1st0crat commented 3 years ago

Hi!

Regarding the STFT parameters - take a look here at n_fft and win_length parameters.

In short, the FFT size cannot be smaller than the size of the analysis window, so NWaves sets the FFT size to 1024 automatically in your case, which gives 1024/2 + 1 = 513 bins per spectrum. I guess what you're trying to do is analyze the signal in a 1024-sample window and then obtain a spectrum with lower frequency resolution (256 samples). So essentially, you need to compute the 1024-point FFT and then sum the values inside each group of 4 adjacent samples. Something like this:

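// s is one 513-bin spectrum; bin 0 (DC) is skipped.
// low must have room for (s.Length - 1) / 4 values.
for (int i = 1, k = 0; i + 3 < s.Length; i += 4, k++)
    low[k] = s[i] + s[i + 1] + s[i + 2] + s[i + 3];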
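
For anyone who wants to try the grouping idea without the library, here is a minimal standalone sketch. The class and method names (SpectrumSketch, BinCount, GroupBins) are hypothetical and not part of NWaves. BinCount assumes the effective FFT size is simply the larger of windowSize and fftSize; the library may round it differently for other inputs.

```csharp
using System;

static class SpectrumSketch
{
    // Number of bins in a one-sided spectrum of a real signal: N/2 + 1.
    // Assumption: the effective FFT size is max(windowSize, fftSize).
    public static int BinCount(int windowSize, int fftSize)
    {
        int effective = Math.Max(windowSize, fftSize);
        return effective / 2 + 1;
    }

    // Sums each group of groupSize adjacent bins, skipping bin 0 (DC).
    // Trailing bins that do not fill a whole group are dropped.
    public static float[] GroupBins(float[] s, int groupSize)
    {
        var low = new float[(s.Length - 1) / groupSize];
        for (int k = 0; k < low.Length; k++)
        {
            float sum = 0f;
            for (int j = 0; j < groupSize; j++)
                sum += s[1 + k * groupSize + j];
            low[k] = sum;
        }
        return low;
    }
}
```

With windowSize 1024 and fftSize 256, BinCount returns 513, and GroupBins with a group size of 4 turns a 513-bin spectrum into 128 values.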

Q1: Yes, Y represents power (the spectrogram is a sequence of power spectra).

Q2: Each power spectrum is normalized by the FFT size (by default). Unfortunately, I didn't add a boolean parameter for normalization to the Spectrogram() method (Fft.PowerSpectrum() has this parameter, though). This parameter will be added in the next version of the lib.
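
To illustrate what "normalized by FFT size" can mean, here is a rough standalone sketch. It assumes the normalization divides each squared magnitude |X[k]|^2 by the FFT size N. The PowerSpectrumSketch class is hypothetical, uses a naive DFT for readability, and is not the NWaves implementation.

```csharp
using System;

static class PowerSpectrumSketch
{
    // Naive O(N^2) DFT, for illustration only.
    // Returns the N/2 + 1 power values of a real signal x.
    public static double[] PowerSpectrum(double[] x, bool normalize = true)
    {
        int n = x.Length;
        var power = new double[n / 2 + 1];
        for (int k = 0; k < power.Length; k++)
        {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++)
            {
                double angle = 2 * Math.PI * k * t / n;
                re += x[t] * Math.Cos(angle);
                im -= x[t] * Math.Sin(angle);
            }
            double p = re * re + im * im;
            // Assumption: normalization means dividing by the FFT size.
            power[k] = normalize ? p / n : p;
        }
        return power;
    }
}
```

Under this assumption, multiplying each normalized value by the FFT size recovers the unnormalized power.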

Regards, Tim

Bambofy commented 3 years ago

Thank you for the detailed reply!