Closed jjjjohnson closed 3 years ago
Hi Junjie, the SpecAugment implemented as a module is not a valid version (it may contain unresolved bugs). The version implemented in the dataset is the recommended one to use.
Thanks for your reply!
I have another question: why is only the output of the frequency mask multiplied by inverted_factor, while the time mask is not?
https://github.com/Snowdar/asv-subtools/blob/1ea98945b50b6401fd28627e760f6e66a08a22e7/pytorch/libs/egs/augmentation.py#L84
The scale is applied after frequency masking to keep the mean over channels roughly unchanged (as in dropout). We do not apply it along the time dimension because time masking is channel-independent. In fact, if you scale along the time dimension, you may get slightly worse results.
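To make the dropout analogy concrete, here is a minimal sketch of frequency masking followed by inverted scaling. It is not the code from augmentation.py: the function name freq_mask_with_rescale and the formula num_bins / (num_bins - f) are assumptions chosen to mirror inverted dropout.

```python
import torch


def freq_mask_with_rescale(feats: torch.Tensor, f_0: int, f: int) -> torch.Tensor:
    """Zero f frequency bins starting at f_0, then rescale (hypothetical helper).

    feats is assumed to have shape [batch, frequency, time].
    """
    num_bins = feats.shape[1]
    if not 0 <= f < num_bins:
        raise ValueError("f must satisfy 0 <= f < number of frequency bins")
    out = feats.clone()  # leave the caller's tensor untouched
    out[:, f_0:f_0 + f, :] = 0.0
    # Assumed dropout-style factor: compensates for the f zeroed bins so the
    # mean over the frequency axis stays close to its unmasked value.
    inverted_factor = num_bins / (num_bins - f)
    return out * inverted_factor


feats = torch.ones(2, 8, 5)  # [batch, frequency, time]
masked = freq_mask_with_rescale(feats, f_0=2, f=2)
mean_over_bins = masked.mean(dim=1)  # shape [batch, time]
```

With an all-ones input, the mean over the frequency axis stays at 1 after masking and rescaling, which is the property the scale is meant to preserve.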
Cool! Thanks for the clarification!
Hi @Snowdar Since the inputs are of shape [batch, frequency, time], this line https://github.com/Snowdar/asv-subtools/blob/1ea98945b50b6401fd28627e760f6e66a08a22e7/pytorch/libs/nnet/dropout.py#L217 should be:
inputs[:,f_0:f_0+f,:].fill_(0.)
Is that correct? Thanks Junjie
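A quick way to check the proposed line is to run it on a small tensor. The sketch below assumes inputs of shape [batch, frequency, time] as stated above; the values of f_0 and f are arbitrary and chosen only for illustration.

```python
import torch

inputs = torch.ones(2, 6, 4)  # [batch, frequency, time]
f_0, f = 1, 2                 # illustrative mask start and width

# Basic slicing returns a view, and fill_ (trailing underscore) works in
# place, so this zeroes bins f_0 .. f_0 + f - 1 of `inputs` itself.
inputs[:, f_0:f_0 + f, :].fill_(0.)

masked_bins = inputs[:, f_0:f_0 + f, :]
untouched_bins = torch.cat([inputs[:, :f_0, :], inputs[:, f_0 + f:, :]], dim=1)
```

Only the selected frequency bins are zeroed, for every batch item and every time step. The remaining bins keep their original values.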