mesolitica / malaya-speech

Speech Toolkit for Malaysian language, https://malaya-speech.readthedocs.io/
MIT License

Ask for tips in conformer model training #24

Closed mr-coconut closed 2 years ago

mr-coconut commented 2 years ago

Hi, do you have any tips for training models on telephony speech datasets? For example, do we need noise reduction or speech enhancement on telephony data before feeding it into training? How long should the training data be in general? How many epochs/rounds are typically needed? Thanks!

huseinzol05 commented 2 years ago

You should do noise augmentation so the model sees a wider range of audio conditions, e.g.,

  1. if the original sample rate is 16k, downsample it to 4k, then upsample back to 16k
  2. room echo
  3. loudness

Or you can try https://iver56.github.io/audiomentations/
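The three augmentations listed above can be sketched roughly as follows. This is a minimal NumPy-only illustration, not code from malaya-speech or audiomentations: the function names, default parameters, the FFT-based approximation of the 16k -> 4k -> 16k round trip, and the synthetic impulse response are all assumptions made for the example. It expects mono float audio scaled to [-1, 1].

```python
import numpy as np


def simulate_low_sample_rate(audio, sr=16000, low_sr=4000):
    """Approximate a sr -> low_sr -> sr round trip.

    An ideal resample down to low_sr and back removes everything at or
    above low_sr / 2, so we zero those frequency bins directly.
    """
    spectrum = np.fft.rfft(audio)
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / sr)
    spectrum[freqs >= low_sr / 2] = 0.0
    return np.fft.irfft(spectrum, n=len(audio))


def add_room_echo(audio, sr=16000, decay_seconds=0.3, wet=0.3, rng=None):
    """Mix in a reverberant copy made with a synthetic impulse response
    (exponentially decaying noise); a real recorded IR would be better."""
    if rng is None:
        rng = np.random.default_rng()
    n = int(sr * decay_seconds)
    t = np.arange(n) / sr
    ir = rng.standard_normal(n) * np.exp(-6.0 * t / decay_seconds)
    ir /= np.sqrt(np.sum(ir ** 2))  # normalise the IR to unit energy
    reverberant = np.convolve(audio, ir)[: len(audio)]
    return (1.0 - wet) * audio + wet * reverberant


def random_gain(audio, min_db=-12.0, max_db=6.0, rng=None):
    """Apply a random loudness change in decibels, clipped to [-1, 1]."""
    if rng is None:
        rng = np.random.default_rng()
    gain_db = rng.uniform(min_db, max_db)
    return np.clip(audio * 10.0 ** (gain_db / 20.0), -1.0, 1.0)


def augment(audio, sr=16000, rng=None):
    """Apply each augmentation with probability 0.5 (an arbitrary choice)."""
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() < 0.5:
        audio = simulate_low_sample_rate(audio, sr=sr)
    if rng.random() < 0.5:
        audio = add_room_echo(audio, sr=sr, rng=rng)
    if rng.random() < 0.5:
        audio = random_gain(audio, rng=rng)
    return audio
```

Calling `augment(waveform, sr=16000)` on each training example, on the fly, gives a different variant every epoch. For real training a proper resampler and recorded room impulse responses would likely be more realistic than these approximations.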

I trained those models until the test loss hit a plateau.
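One way to make "until the test loss hit a plateau" concrete is a simple patience check on the history of evaluation losses. This is only a sketch under assumed settings: the helper name `has_plateaued` and the `patience` and `min_delta` values are hypothetical, not what was used for the released models.

```python
def has_plateaued(losses, patience=5, min_delta=1e-3):
    """Return True if the last `patience` evaluations did not improve on
    the best earlier loss by at least `min_delta`."""
    if len(losses) <= patience:
        return False  # not enough history to judge yet
    best_before = min(losses[:-patience])
    best_recent = min(losses[-patience:])
    return best_recent > best_before - min_delta
```

In a training loop, append the test loss after each evaluation and stop once `has_plateaued(history)` returns True.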