Emrys365 / espnet

End-to-End Speech Processing Toolkit
https://espnet.github.io/espnet/
Apache License 2.0

Pretrained Model #15

Closed quancs closed 1 year ago

quancs commented 1 year ago

Hello, I am currently trying to compare my front-end model with the one in the paper "End-to-End Far-Field Speech Recognition with Unified Dereverberation and Beamforming". Are there any pretrained models available for that paper?

Emrys365 commented 1 year ago

@quancs I'm sorry I didn't notice this thread earlier. Unfortunately, I deleted all the checkpoints several months ago to free up disk space.

But you can reproduce my experiments by following the instructions in https://github.com/Emrys365/espnet/tree/wsj1_mix_spatialized/egs/wsj1_mix_spatialized/asr1.

In addition, I would recommend comparing with the model in the follow-up paper, "End-to-End Dereverberation, Beamforming, and Speech Recognition with Improved Numerical Stability and Advanced Frontend", which largely mitigates the numerical instability and improves the performance of the model.

You can also reproduce the experiments in this paper by following the recipe in https://github.com/Emrys365/espnet/tree/numerical_stability/egs/wsj1_mix_spatialized/asr1.

Alternatively, you can check the journal version of the two aforementioned papers, "End-to-End Dereverberation, Beamforming, and Speech Recognition in a Cocktail Party", which includes results on more datasets.

Let me know if you need further information about the papers or help with reproducing the experiments.

quancs commented 1 year ago

OK, I will try. Thank you for your patient response. ^_^