Ask for tips in conformer model training

mesolitica / malaya-speech

Speech Toolkit for Malaysian language, https://malaya-speech.readthedocs.io/


MIT License


Ask for tips in conformer model training #24

Closed. mr-coconut closed this issue 2 years ago

mr-coconut commented 2 years ago

Hi~~~ Do you have any tips for training models on telephony speech datasets? For example, do we need noise reduction or speech enhancement on telephony data before feeding it into training? How long should the training data be in general? How many epochs/rounds are typically needed? Thanks~~~

huseinzol05 commented 2 years ago

You should do noise augmentation so the model sees a wider range of audio conditions, e.g.:

if the original sample rate is 16 kHz, downsample it to 4 kHz, then upsample back to 16 kHz
room echo
loudness
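
The three augmentations listed above could be sketched roughly as follows. This is only an illustration assuming NumPy, not code from malaya-speech: the function names `band_limit`, `add_room_echo` and `change_loudness`, the synthetic impulse response, and the gain range are all hypothetical choices. The 16 kHz -> 4 kHz -> 16 kHz round trip is emulated by zeroing everything above 2 kHz (the Nyquist limit of 4 kHz audio) instead of calling a real resampler.

```python
import numpy as np

def band_limit(audio, sr=16000, low_sr=4000):
    """Emulate a sr -> low_sr -> sr round trip by removing content above low_sr / 2."""
    spectrum = np.fft.rfft(audio)
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / sr)
    spectrum[freqs >= low_sr / 2] = 0.0
    return np.fft.irfft(spectrum, n=len(audio))

def add_room_echo(audio, sr=16000, decay_s=0.3, rng=None):
    """Convolve with a synthetic impulse response: direct path plus decaying noise."""
    rng = np.random.default_rng() if rng is None else rng
    n = int(decay_s * sr)
    t = np.arange(n) / sr
    ir = 0.1 * rng.standard_normal(n) * np.exp(-6.0 * t / decay_s)
    ir[0] = 1.0  # direct (dry) path
    wet = np.convolve(audio, ir)[: len(audio)]
    peak = np.max(np.abs(wet))
    # Rescale so the echoed signal keeps the original peak level.
    return wet / peak * np.max(np.abs(audio)) if peak > 0 else wet

def change_loudness(audio, min_db=-12.0, max_db=6.0, rng=None):
    """Apply a random gain in decibels and clip to the [-1, 1] range."""
    rng = np.random.default_rng() if rng is None else rng
    gain_db = rng.uniform(min_db, max_db)
    return np.clip(audio * 10 ** (gain_db / 20), -1.0, 1.0)

# Toy input: one second of a 440 Hz tone plus a 3 kHz tone at 16 kHz.
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
audio = 0.5 * np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)

augmented = change_loudness(add_room_echo(band_limit(audio, sr), sr, rng=rng), rng=rng)
```

In practice each step would probably be applied with some probability per training example, so the model still sees clean audio as well.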

Or you can try audiomentations, https://iver56.github.io/audiomentations/

I trained those models until the test loss hit a plateau.
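
One way to turn "train until the loss plateaus" into a stopping rule is a patience check like the sketch below. The helper name `should_stop` and the `patience` and `min_delta` values are assumptions for illustration; the thread does not say how the plateau was actually detected.

```python
def should_stop(losses, patience=3, min_delta=1e-3):
    """Return True when the last `patience` evaluations failed to beat
    the earlier best loss by at least `min_delta`."""
    if len(losses) <= patience:
        return False  # not enough history yet
    best_before = min(losses[:-patience])
    best_recent = min(losses[-patience:])
    return best_recent > best_before - min_delta

# Hypothetical loss values, one per evaluation round.
history = [2.0, 1.5, 1.2, 1.1, 1.1, 1.1, 1.1]
stop_round = next(
    (i for i in range(1, len(history) + 1) if should_stop(history[:i])),
    None,
)
```

With this made-up history the rule fires once three consecutive evaluations have shown no improvement over the earlier best.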