wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit
https://wenet-e2e.github.io/wenet/
Apache License 2.0

Using hamming window for Paraformer frontend. #2549

Closed TeaPoly closed 3 months ago

TeaPoly commented 3 months ago

The default window in the Paraformer frontend is hamming; more details can be found here. However, the default window in kaldi.fbank is povey, as specified here. This difference in window type may cause a slight mismatch between the features. As mentioned in line 44 of this document:

"povey" is a window I made to be similar to Hamming but to go to zero at the edges, it's pow((0.5 - 0.5*cos(n/N*2*pi)), 0.85) I just don't think the Hamming window makes sense as a windowing function.
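
To make the difference concrete, here is a minimal sketch that evaluates both windows from the formulas above using only the standard library. The helper names, the 400-sample frame length (25 ms at 16 kHz) and the symmetric `N - 1` denominator are assumptions made for illustration; they are not taken from the Paraformer frontend or from this PR.

```python
import math


def hamming_window(n_points):
    """Symmetric Hamming window: 0.54 - 0.46 * cos(2*pi*i / (N - 1))."""
    a = 2.0 * math.pi / (n_points - 1)  # assumed symmetric (N - 1) convention
    return [0.54 - 0.46 * math.cos(a * i) for i in range(n_points)]


def povey_window(n_points):
    """Povey window: a Hann window raised to the power 0.85."""
    a = 2.0 * math.pi / (n_points - 1)  # assumed symmetric (N - 1) convention
    return [(0.5 - 0.5 * math.cos(a * i)) ** 0.85 for i in range(n_points)]


frame_length = 400  # hypothetical: 25 ms frames at 16 kHz
hamming = hamming_window(frame_length)
povey = povey_window(frame_length)

# Largest pointwise gap between the two windows over one frame.
max_gap = max(abs(h - p) for h, p in zip(hamming, povey))

print(f"hamming edge value: {hamming[0]:.4f}")
print(f"povey edge value:   {povey[0]:.4f}")
print(f"max pointwise gap:  {max_gap:.4f}")
```

The Hamming window stays at 0.08 at the frame edges while the povey window goes to zero there, so features extracted with one window and consumed by a model trained on the other will differ slightly on every frame.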