Implementation of "An open-source end-to-end ASR system for Brazilian Portuguese using DNNs built from newly assembled corpora" by Igor Quintanilha, Luiz Wagner Pereira Biscainho, and Sergio Lima Netto (submitted).
All datasets can be found here.
Acoustic model | Trained on | Method | WER | Download |
---|---|---|---|---|
DeepSpeech 2 | BRSD v2 | Scratch | 52.55% (2.42%) | Link |
DeepSpeech 2 | BRSD v2 | Fine-tuned | 47.41% (1.73%) | Link |
Language model* | RP | Size | LapsBM | BRTD |
---|---|---|---|---|
word 3-gram | 25 | 1.9G | 173.79 | 161.29 |
word 5-gram | 42 | 7.8G | 136.50 | 135.12 |
char 5-gram | 5 | 41M | <=2,334.48 | <=2,694.51 |
char 10-gram | 10 | 4.7G | <=271.86 | <=323.71 |
char 15-gram* | 15 | 5.4G | <=239.59 | <=198.49 |
char 20-gram* | 20 | 8.8G | <=227.84 | <=189.53 |
*All models were trained using KenLM. See the paper for more details.
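
For readers unfamiliar with the WER column in the acoustic-model table above, the following is a minimal sketch of the standard word error rate definition: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; this is not the evaluation code used in the paper, which may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of four reference words.
    print(word_error_rate("o gato preto dorme", "o gato dorme"))
```

Because the denominator is the reference length, WER can exceed 100% when the hypothesis contains many insertions.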