Closed robolamp closed 2 years ago
Hello!
Today I found out that the mel spectrogram frequency augmentation (FrequencyMaskingAug), defined here: https://github.com/makcedward/nlpaug/blob/master/nlpaug/augmenter/spectrogram/frequency_masking.py#L8, is applied only to the beginning of the spectrogram. For some reason, the length of the mask is equal to the number of mel frequency channels instead of the full length of the audio.
I suppose this is caused by using the len() function instead of data.shape[1] to determine the number of time points of the audio in this line: https://github.com/makcedward/nlpaug/blob/master/nlpaug/augmenter/spectrogram/spectrogram_augmenter.py#L53
UPD: I'm suggesting a small PR that fixes this problem (of course, only if I've correctly understood what's happening there).
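To illustrate what I think is going on, here is a minimal sketch. It assumes the spectrogram is a NumPy array shaped (n_mels, n_frames); the helper `mask_frequency_band` and its parameters are made up for this example and are not the actual nlpaug code.

```python
import numpy as np


def mask_frequency_band(data, f0, width, time_len):
    """Zero out mel channels [f0, f0 + width) over the first time_len frames.

    Hypothetical helper for illustration only; not the nlpaug implementation.
    """
    augmented = data.copy()
    augmented[f0:f0 + width, :time_len] = 0
    return augmented


# Assumed layout: rows are mel channels, columns are time frames.
n_mels, n_frames = 128, 1000
mel = np.ones((n_mels, n_frames))

# len() returns the size of the first axis, i.e. the number of mel channels,
# so the mask covers only the first 128 frames.
buggy = mask_frequency_band(mel, f0=10, width=20, time_len=len(mel))

# shape[1] is the number of time frames, so the mask covers the whole clip.
fixed = mask_frequency_band(mel, f0=10, width=20, time_len=mel.shape[1])

# Count how many frames were zeroed in one of the masked channels.
buggy_masked_frames = int((buggy[10] == 0).sum())
fixed_masked_frames = int((fixed[10] == 0).sum())
print(buggy_masked_frames, fixed_masked_frames)  # 128 1000
```

With these assumed shapes, the len() variant masks only the first 128 of 1000 frames, which matches the "only the beginning of the spectrogram" behaviour I'm seeing.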