KinWaiCheuk / nnAudio

Audio processing by using pytorch 1D convolution network
MIT License

CQT and Log Magnitude #30

Closed by mcallistertyler 4 years ago

mcallistertyler commented 4 years ago

Hi,

Thank you for this library. It has been really straightforward to use and has helped me migrate a PyTorch project from Mel-spectrograms to CQT.

Unfortunately, my knowledge of spectrogram representations is essentially non-existent. I'd like to know whether the spectrogram returned from the CQT1992v2 class is a 'log magnitude' spectrogram when the output_format is Magnitude.

I see the terms magnitude and log magnitude used frequently in research papers, so I'm not sure whether they are being used interchangeably or whether there is a difference.

Thanks

KinWaiCheuk commented 4 years ago

Hi mcallistertyler95, thanks for your question.

The spectrograms returned by CQT1992v2 and other layers such as STFT or Melspectrogram are plain magnitude, not log magnitude. To get a log-magnitude output, you can do something like this:

import torch
from nnAudio import Spectrogram

# Initialize the conv1d-based CQT layer (1992 version)
CQT_layer = Spectrogram.CQT1992v2(fs, stride, fmin, fmax=None, n_bins=n_bins,
                                  bins_per_octave=bins, norm=1, center=True,
                                  pad_mode='reflect', device=device)
cqt_torch = CQT_layer(torch.tensor(x).to(device))  # Feed forward your input x
cqt_torch = torch.log(cqt_torch)  # Convert magnitude to log magnitude
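One caveat worth noting: the logarithm of an exact zero is not finite (torch.log returns -inf there), so a silent bin can produce infinities downstream. A common workaround, which is an assumption here and not something the thread prescribes, is to add a small constant before taking the log. The sketch below shows the idea in plain Python; the function name `safe_log_magnitude`, the `eps` value, and the sample values are all hypothetical. On a tensor, the analogous expression would be something like `torch.log(cqt_torch + eps)`.

```python
import math


def safe_log_magnitude(magnitudes, eps=1e-6):
    """Return log(m + eps) for each magnitude.

    eps is an illustrative choice (an assumption), used only to keep
    exact zeros from producing a non-finite logarithm.
    """
    return [math.log(m + eps) for m in magnitudes]


# Hypothetical magnitudes for three bins; the first bin is silent.
frame = [0.0, 0.5, 2.0]
log_frame = safe_log_magnitude(frame)
```

The right `eps` depends on the scale of your data, so treat the value above as a placeholder to tune.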

Also, regarding your question:

"I see the terms magnitude and log magnitude used frequently in research papers so I'm not sure if this is a case of them being used interchangeably or whether there is a difference."

Magnitude and log magnitude are not the same. Magnitude usually refers to the output obtained directly from the STFT or another time-frequency transform; log magnitude is the logarithm of that output.

You can think of taking the log of the spectrogram magnitude as a normalization technique. Please see the picture below to compare the same output before and after taking the log.

[Image: the same output before and after taking the log]
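To make the normalization effect concrete, here is a minimal numeric sketch in plain Python. The magnitudes are made-up illustrative values, not real CQT output. It compares the spread of the values before and after taking the natural log.

```python
import math

# Hypothetical magnitudes for four frequency bins (illustrative only).
magnitudes = [0.001, 0.01, 1.0, 100.0]

# Natural log of each magnitude, as torch.log would compute element-wise.
log_magnitudes = [math.log(m) for m in magnitudes]

# Before the log: the loudest bin is about 100,000 times the quietest.
linear_ratio = max(magnitudes) / min(magnitudes)

# After the log: the same spread becomes a difference of log(1e5), about 11.5.
log_span = max(log_magnitudes) - min(log_magnitudes)

print(f"linear ratio: {linear_ratio:.0f}")
print(f"log span:     {log_span:.2f}")
```

A ratio of five orders of magnitude collapses to a span of roughly 11.5, which is why quiet components become visible in a log-magnitude plot and why the compressed range tends to be easier for a model to work with.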

mcallistertyler commented 4 years ago

Thanks for this clear explanation :)