espnet / notebook


ASR training using ESPnet2 library calls #16

Open Tirthankar-iiitb opened 3 years ago

Tirthankar-iiitb commented 3 years ago

Hi, I am looking for an example notebook that trains an ASR model on a dataset such as TIMIT using ESPnet2 library calls. The data preparation needs to be done separately in Python (not using recipes), producing 'sound' or 'npy' input (not Kaldi style) as required. Any pointers on the training part would be helpful. /Tirthankar

Here is my experiment, but it gives an error during epoch 1 of training: timit_train_espnet2.md

sw005320 commented 3 years ago

Thanks. We have not actually prepared an ESPnet2 ASR example, so it would be great if you could make it work and report back.

According to the log file you attached, the input data seems to have some issues. Your current ESPnet2 setup assumes time-domain waveforms (16 kHz sampling) as input rather than speech features. Could you check that? You would also need to compute the mean and variance statistics.
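For readers following along, the two checks suggested above could be sketched roughly as follows. This is only an illustration using the Python standard library, not ESPnet2 API: `check_waveform` and `RunningStats` are hypothetical helper names, the input is assumed to be PCM WAV files, and the feature frames are assumed to be plain lists of floats.

```python
import wave


def check_waveform(path, expected_rate=16000):
    """Return (sample_rate, num_channels, num_frames) for a PCM WAV file.

    Hypothetical helper: raises ValueError if the sampling rate differs
    from the rate the model setup is assumed to expect (16 kHz here).
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()
    if rate != expected_rate:
        raise ValueError(f"{path}: expected {expected_rate} Hz, got {rate} Hz")
    return rate, channels, frames


class RunningStats:
    """Accumulate per-dimension mean and variance over many utterances.

    Hypothetical helper: keeps only the count, the sum, and the sum of
    squares, so utterances can be processed one at a time.
    """

    def __init__(self, dim):
        self.count = 0
        self.total = [0.0] * dim
        self.total_sq = [0.0] * dim

    def update(self, frames):
        # frames: iterable of feature vectors, each of length dim
        for frame in frames:
            self.count += 1
            for i, value in enumerate(frame):
                self.total[i] += value
                self.total_sq[i] += value * value

    def mean_var(self):
        if self.count == 0:
            raise ValueError("no frames have been accumulated")
        mean = [s / self.count for s in self.total]
        # Population variance E[x^2] - E[x]^2, clipped at zero to guard
        # against small negative values from floating-point rounding.
        var = [
            max(sq / self.count - m * m, 0.0)
            for sq, m in zip(self.total_sq, mean)
        ]
        return mean, var
```

Calling `check_waveform` on every training file before training would surface a sampling-rate mismatch early, and `RunningStats.update` can be called once per utterance before reading the final statistics with `mean_var()`. How these statistics are then handed to ESPnet2 depends on the training configuration and is not shown here.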

Tirthankar-iiitb commented 3 years ago

Thanks for your reply. I have noted your suggestions and will review and rerun to make this work.

