bzamecnik / tfr

Spectral audio feature extraction using time-frequency reassignment
MIT License
Terminology + plot #26

Closed josephernest closed 6 years ago

josephernest commented 6 years ago

What is the difference between a pitchgram and a spectrogram? Aren't both plots of the STFT magnitude? Same question for the reassigned pitchgram vs. the reassigned spectrogram: what is the difference?

Last: what is the best way to plot the spectrogram?

This works, but it's not optimal (i.e. the window is too small, it's not zoomed, etc.):

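import matplotlib.pyplot as plt
import tfr

# Load the audio file and split it into frames
signal_frames = tfr.SignalFrames('test.wav')
# Compute the pitchgram from the frames
P = tfr.pitchgram(signal_frames)
# Transpose so that time runs along the x axis
plt.imshow(P.transpose())
plt.show()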

Do you think you could add ready-to-use spectrogram/pitchgram plotting code to the README?

bzamecnik commented 6 years ago

Hi Joseph, a spectrogram is a matrix with overlapping time intervals on the x axis and FFT bins (linear frequency) on the y axis. In a reassigned/requantized spectrogram the time bins do not overlap. In a pitchgram the frequency is transformed to a logarithmic scale and quantized to pitch bins (possibly subdivided). Thus the pitchgram is more musically relevant.
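To make the linear-vs-logarithmic distinction concrete, here is a minimal sketch of mapping a frequency to a quantized pitch bin. It is not taken from tfr: the function name, the MIDI-style numbering (A4 = 440 Hz = pitch 69) and the `bins_per_semitone` parameter are assumptions chosen for illustration, and the library's own tuning and bin layout may differ.

```python
import math


def frequency_to_pitch_bin(freq_hz, bins_per_semitone=1, a4_hz=440.0):
    """Map a frequency in Hz to a quantized pitch on a logarithmic scale.

    Hypothetical helper, not part of tfr. Pitch is expressed in MIDI-style
    semitones (A4 = 69) and rounded to the nearest 1/bins_per_semitone.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # Each doubling of frequency (one octave) adds 12 semitones.
    pitch = 69.0 + 12.0 * math.log2(freq_hz / a4_hz)
    # Quantize to pitch bins, optionally subdividing each semitone.
    return round(pitch * bins_per_semitone) / bins_per_semitone
```

On this scale an octave always spans the same number of bins (100 Hz to 200 Hz is as wide as 1000 Hz to 2000 Hz), whereas on the linear FFT axis the second octave covers ten times as many bins as the first.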

plt.imshow might be OK, but you can adjust various parameters to make it more readable. You can also check the librosa plotting functions, which are just customized calls to imshow + ticks.
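As a rough sketch of the kind of parameter tweaks meant here, the helper below wraps `imshow` with a larger figure, a non-square aspect ratio and labelled axes. The function name `plot_tf_matrix` is hypothetical, and it assumes the input has shape (time frames, bins), as the transpose in the question suggests; adjust if your matrix is laid out differently.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_tf_matrix(P, ax=None, title="Pitchgram"):
    """Show a (time frames x bins) matrix as an image and return the axes.

    Hypothetical helper, not part of tfr.
    """
    P = np.asarray(P)
    if ax is None:
        # A wider figure than the default window.
        _, ax = plt.subplots(figsize=(12, 5))
    image = ax.imshow(
        P.T,                      # time on the x axis, bins on the y axis
        origin="lower",           # lowest bin at the bottom
        aspect="auto",            # fill the axes instead of square pixels
        interpolation="nearest",  # no smoothing between bins
        cmap="magma",
    )
    ax.set_xlabel("time frame")
    ax.set_ylabel("frequency / pitch bin")
    ax.set_title(title)
    ax.figure.colorbar(image, ax=ax)
    return ax
```

With the code from the question this would be called as `plot_tf_matrix(tfr.pitchgram(tfr.SignalFrames('test.wav')))` followed by `plt.show()`.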

Yeah, we could improve the readme...

Kind regards,

Bohumir

josephernest commented 6 years ago

Oh ok, pitchgram has a log frequency scale (a bit like Constant-Q transform?)...

bzamecnik commented 6 years ago

Yeah, but it solves some problems with CQT. I used CQT before and this was superior.

josephernest commented 6 years ago

Ok thanks! Last question (and then I'll study this by myself): with standard STFT, it's possible to STFT a signal, modify / do some masking on the S[t,w] matrix, then perform an iSTFT and get a modified signal. This is very often used in denoising, etc.

Is there something similar with Reassigned spectrogram? i.e. you perform a ReassignedSTFT, then you get a matrix, you modify/mask some values in some time frames, and then inverseReassignedSTFT to get the modified signal? If possible, would you have an example for this?
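For reference, the standard STFT round trip described above (transform, mask, inverse transform) can be sketched with NumPy alone. This is a minimal illustration, not tfr code: the function names, the Hann window, and the frame and hop sizes are all assumptions, and the inverse uses plain windowed overlap-add.

```python
import numpy as np


def stft(x, frame=512, hop=128):
    """Return a (frames x bins) complex STFT using a periodic Hann window."""
    window = np.hanning(frame + 1)[:-1]
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack([x[i * hop:i * hop + frame] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(S, frame=512, hop=128):
    """Invert stft() by windowed overlap-add, normalized by window energy."""
    window = np.hanning(frame + 1)[:-1]
    n_frames = S.shape[0]
    out = np.zeros(frame + hop * (n_frames - 1))
    norm = np.zeros_like(out)
    frames = np.fft.irfft(S, n=frame, axis=1)
    for i in range(n_frames):
        out[i * hop:i * hop + frame] += frames[i] * window
        norm[i * hop:i * hop + frame] += window ** 2
    # Avoid dividing by zero at the very edges, where the window vanishes.
    return out / np.maximum(norm, 1e-12)


def remove_above(x, fs, cutoff_hz, frame=512, hop=128):
    """Zero all STFT bins above cutoff_hz and resynthesize the signal."""
    S = stft(x, frame, hop)
    bin_freqs = np.fft.rfftfreq(frame, d=1.0 / fs)
    S[:, bin_freqs > cutoff_hz] = 0.0  # the masking step
    return istft(S, frame, hop)
```

The round trip works because every STFT cell sits on a fixed time-frequency grid, so the overlap-add knows where each frame belongs. Samples near the start and end of the signal are covered by fewer frames and are reconstructed less accurately.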

bzamecnik commented 6 years ago

Yeah, you're asking if the transform is invertible. I'm afraid not directly, but maybe there's some lossy transform. Some research may have tried to find invertible variants of the reassigned STFT. Anyway, I remember the NSGT (non-stationary Gabor transform), which is overcomplete and invertible. But I don't have practical experience with it yet.

Hope it helps.

josephernest commented 6 years ago

Thanks. IIRC:

Our CQ-NSGT algorithm is not limited to linear frequency scales (as is the STFT), but can be used with arbitrary scales, most notably with logarithmic (aka “constant-Q”) scales relevant for musical applications, or also perceptually informed Mel scales.

https://grrrr.org/research/software/nsgt/

Also, if I remember correctly, with NSGT or CQT, even though the frequency bin scale changes, you still have the classical trade-off between time-domain resolution and frequency-domain resolution.

But "STFT with reassignment" seems to be another world, with much more precise localization of frequencies, doesn't it? Would you have a reference for this:

"Yeah, you're asking if the transform is invertible. I'm afraid not directly, but maybe there's some lossy transform"

Even if it's lossy, I'd be interested in an invertible reassigned STFT
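To illustrate why reassignment can localize a frequency more precisely than the FFT bin grid, here is a small sketch of a cross-spectral (phase-difference) frequency estimate: two frames taken one sample apart are compared, and the phase advance in each bin gives an instantaneous frequency for that bin. This is one common finite-difference way to obtain the frequency estimate used in reassignment; the thread does not say whether tfr computes it this way, and the function name and parameters are hypothetical.

```python
import numpy as np


def instantaneous_frequencies(x, fs, start=0, frame=512):
    """Estimate per-bin instantaneous frequency from two shifted frames.

    Hypothetical helper, not part of tfr. Returns (magnitudes, frequencies
    in Hz) for the frame beginning at sample index `start`.
    """
    window = np.hanning(frame + 1)[:-1]  # periodic Hann window
    X0 = np.fft.rfft(x[start:start + frame] * window)
    X1 = np.fft.rfft(x[start + 1:start + 1 + frame] * window)
    # Phase advance over one sample, in radians, for every bin.
    phase_advance = np.angle(X1 * np.conj(X0))
    # Convert radians per sample to Hz.
    return np.abs(X0), phase_advance * fs / (2.0 * np.pi)
```

For a pure tone at 443.7 Hz sampled at 8000 Hz with a 512-sample frame, the nearest bin centre is several Hz away from the tone, while the estimate in the peak bin lands very close to the true frequency. The values are moved off the regular grid in this process, which is part of why a direct inverse is not straightforward.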