ksanjeevan / crnn-audio-classification

UrbanSound classification using Convolutional Recurrent Networks in PyTorch
MIT License

questions about transforms #20

Closed zhaoyun-ai closed 3 years ago

zhaoyun-ai commented 3 years ago

Hi, thanks for your excellent work. I noticed that there is a class called `ImageTransforms` in transforms.py. Is it meant to apply image transformations to the spectrogram? In other words, are image transformations applicable to spectrograms?

ksanjeevan commented 3 years ago

Hi @zhaoyun-ai, that's my mistake; this class was meant for another net/data.

In general, we have to be careful about which "image transforms" we apply to spectrograms. For example, stretching with linear interpolation might be valid for an image, but it won't be for a spectrogram. On the other hand, adding Gaussian noise is valid for both.
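As a rough illustration of the noise case, here is a minimal NumPy sketch. The function name `add_gaussian_noise`, the `std` value, and the all-zeros placeholder spectrogram are assumptions made for this example and are not taken from the repository.

```python
import numpy as np

def add_gaussian_noise(spec, std=0.05, rng=None):
    """Return a copy of `spec` with zero-mean Gaussian noise added."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(loc=0.0, scale=std, size=spec.shape)
    return spec + noise  # the input array is left unchanged

# Placeholder for a (n_mels, n_frames) spectrogram; real data would come from the loader.
spec = np.zeros((64, 128), dtype=np.float32)
noisy = add_gaussian_noise(spec, std=0.05, rng=np.random.default_rng(0))
```

The noise is added elementwise, so the time and frequency axes keep their meaning, which is why this kind of transform carries over from images.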

You can find some valid spectrogram augmentations in pytorch/audio.
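One common family of spectrogram augmentations is masking a random band of frequency bins or time frames; torchaudio provides this as `FrequencyMasking` and `TimeMasking` in `torchaudio.transforms`. The sketch below reimplements the idea in plain NumPy so it runs without torchaudio. The helper `mask_axis` and its parameters are hypothetical names chosen for illustration, not the library's API.

```python
import numpy as np

def mask_axis(spec, max_width, axis, rng, fill=0.0):
    """Fill one random band of at most `max_width` bins along `axis` with `fill`."""
    out = spec.copy()
    size = spec.shape[axis]
    # Band width is drawn from 0..max_width (inclusive), capped at the axis length.
    width = min(int(rng.integers(0, max_width + 1)), size)
    start = int(rng.integers(0, size - width + 1))
    index = [slice(None)] * spec.ndim
    index[axis] = slice(start, start + width)
    out[tuple(index)] = fill
    return out

rng = np.random.default_rng(0)
# Placeholder (n_mels, n_frames) spectrogram.
spec = np.ones((64, 128), dtype=np.float32)
augmented = mask_axis(spec, max_width=8, axis=0, rng=rng)        # frequency mask
augmented = mask_axis(augmented, max_width=16, axis=1, rng=rng)  # time mask
```

Unlike stretching, masking does not resample either axis, so the remaining bins still correspond to the same frequencies and time steps.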

zhaoyun-ai commented 3 years ago

Thank you for your reply. I will close this issue.