makcedward / nlpaug

Data augmentation for NLP
https://makcedward.github.io/
MIT License
4.44k stars, 463 forks

An issue with FrequencyMaskingAug #258

Closed robolamp closed 2 years ago

robolamp commented 2 years ago

Hello!

Today I found out that the mel spectrogram frequency augmentation (FrequencyMaskingAug), defined here: https://github.com/makcedward/nlpaug/blob/master/nlpaug/augmenter/spectrogram/frequency_masking.py#L8, is applied only to the beginning of the spectrogram. For some reason, the length of the mask is equal to the number of mel frequency channels instead of the full length of the audio.

I suppose this is caused by using the len() function instead of data.shape[1] to determine the number of time points of the audio in this line: https://github.com/makcedward/nlpaug/blob/master/nlpaug/augmenter/spectrogram/spectrogram_augmenter.py#L53
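To illustrate what I mean, here is a minimal sketch in plain NumPy. It is not the library's code: `mask_band` is a hypothetical helper, and I'm assuming the spectrogram is laid out as (mel channels, time frames) with made-up sizes of 80 x 400.

```python
import numpy as np

# Hypothetical mel spectrogram: 80 mel channels x 400 time frames.
n_mels, n_frames = 80, 400
data = np.ones((n_mels, n_frames))

# len() returns the size of the first axis, i.e. the number of mel channels.
time_range_from_len = len(data)
# shape[1] is the size of the second axis, i.e. the number of time frames.
time_range_from_shape = data.shape[1]


def mask_band(spec, f0, f_width, time_range):
    """Zero mel channels [f0, f0 + f_width) over the first `time_range` frames.

    Hypothetical helper for illustration only, not nlpaug's implementation.
    """
    out = spec.copy()
    out[f0:f0 + f_width, :time_range] = 0
    return out


short_mask = mask_band(data, 10, 5, time_range_from_len)
full_mask = mask_band(data, 10, 5, time_range_from_shape)

print("time range via len():", time_range_from_len)
print("time range via shape[1]:", time_range_from_shape)
print("masked frames in channel 10 (len):", int((short_mask[10] == 0).sum()))
print("masked frames in channel 10 (shape[1]):", int((full_mask[10] == 0).sum()))
```

With len(), only the first 80 of the 400 frames get masked, which matches what I see: the mask covers just the beginning of the spectrogram.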

UPD: I'm proposing a small PR that fixes this problem (assuming, of course, that I've correctly understood what's happening there).