-
See if NoiseSuppressor makes ASR more robust to noise.
[https://developer.android.com/reference/android/media/audiofx/NoiseSuppressor](https://developer.android.com/reference/android/media/audiofx/N…
-
My test set includes some background-noise WAV files, which I use to test the robustness of the models.
But I find it would remove the empty examples in bin/asr/eval.py when I remove lines 114-115 in "uti…
-
File "/opt/whisper/whisper-at/src/noise_robust_asr/intermediate_feat_extract/as_full/extract_as_full_whisper_all.py", line 35, in extract_audio
_, audio_rep = mdl.transcribe_audio(wav)
File …
-
Hi. I tested the model with the inference Jupyter notebook you provided. It's amazing that the model can still generate good voice even when a Mandarin source file is fed as input.
However, I notice that…
-
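Hello, I saw that your code includes a cnn_tdnn script. Could it be used as a replacement when training the acoustic model? I have recently been working on CHiME-6 and don't know much about the back end. Thanks.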
-
I use `time_pooling = nn.AvgPool2d((60,1))` as the temporal pooling layer for the Whisper large pre-trained model (encoder output size is [batch, 1500, 1280]), but the 'last_mlp' and 'last_tr' methods cannot achiev…
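
For reference, a minimal shape check of the pooling layer quoted above, assuming PyTorch is installed. The batch size of 2 and the random input are placeholders, not values from the original report. Note that with a 3-D input, `nn.AvgPool2d` treats the first dimension as channels, so pooling is applied independently to each batch item.

```python
import torch
from torch import nn

# Pooling layer as quoted in the issue: average over 60 time steps, keep features.
time_pooling = nn.AvgPool2d((60, 1))

# Placeholder stand-in for the Whisper large encoder output: [batch, 1500, 1280].
encoder_out = torch.randn(2, 1500, 1280)

pooled = time_pooling(encoder_out)
print(pooled.shape)  # torch.Size([2, 25, 1280])

# The same result written as an explicit mean over non-overlapping 60-frame blocks.
manual = encoder_out.reshape(2, 25, 60, 1280).mean(dim=2)
assert torch.allclose(pooled, manual, atol=1e-5)
```

The 1500 encoder frames are reduced to 1500 / 60 = 25 pooled frames, each still 1280-dimensional.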
-
I have created a [fork](https://github.com/marcinmatys/whisper_streaming/blob/main/README2.md) of whisper_streaming, so I took the liberty of writing about it here.
We may close this issue soon as it…
-
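Hello, I read Yong Xu's 2015 PhD thesis. In the section on noise-aware training, for a sample of 7 consecutive frames, i.e. shape (7, 257), the average power of several consecutive frames of noisy speech is taken as the estimated average noise power. What is the basis for this? I am having trouble understanding this part.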
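
To make the operation in question concrete, here is a rough NumPy sketch of the averaging step as described above. It is not taken from the thesis: the function names, the choice of the first 3 frames, and appending the estimate to the flattened window are all assumptions for illustration.

```python
import numpy as np


def estimate_noise_power(noisy_power, num_frames=3):
    """Average the first `num_frames` frames of a (frames, bins) power spectrogram.

    Which frames are averaged, and how many, is an assumption of this sketch.
    """
    noisy_power = np.asarray(noisy_power, dtype=np.float64)
    if not 1 <= num_frames <= noisy_power.shape[0]:
        raise ValueError("num_frames must be between 1 and the number of frames")
    return noisy_power[:num_frames].mean(axis=0)


def noise_aware_input(noisy_power, num_frames=3):
    """Flatten the window and append the per-bin noise estimate (hypothetical layout)."""
    noisy_power = np.asarray(noisy_power, dtype=np.float64)
    noise = estimate_noise_power(noisy_power, num_frames)
    return np.concatenate([noisy_power.reshape(-1), noise])


# Placeholder data standing in for one (7, 257) window of noisy power spectra.
rng = np.random.default_rng(0)
window = rng.random((7, 257))
features = noise_aware_input(window)
print(features.shape)  # (2056,)
```

The output length is 7 × 257 + 257 = 2056: the flattened window plus one noise value per frequency bin.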
-
Hi Yuan,
The ESC dataset features you provided seem to include only whisper-large-v1, but the provided code appears to use features from more than one model.
Thanks
-
Have you tried building the spectrogram and encoder output in smaller chunks and appending? I think the spectrogram should generate fairly easily with minimal noise depending on the size of the chunk,…