sooftware / kospeech

Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.
https://sooftware.github.io/kospeech/
Apache License 2.0

repetition_remove in CER calculation for CTC model #133

Closed (kelvinqin closed 3 years ago)

kelvinqin commented 3 years ago

Description

A CTC model usually emits repeated characters in its output, but I did not see any repetition removal when you calculate CER. Please take a look; this seems to be another defect.

Thanks!

sooftware commented 3 years ago

Check https://github.com/sooftware/KoSpeech/blob/4d8ddf1322b99f704af92d3e15c274c96972a801/kospeech/vocabs/ksponspeech.py#L78
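
The repetition removal discussed in this issue can be sketched as below. This is a minimal illustration and not KoSpeech's own code: the helper names `ctc_collapse`, `edit_distance`, and `cer`, and the choice of `blank_id=0`, are assumptions made for the example. It merges consecutive repeated labels, drops blanks, and only then computes the character error rate.

```python
from itertools import groupby


def ctc_collapse(label_ids, blank_id=0):
    """Merge consecutive repeats, then drop blank labels.

    blank_id=0 is an assumption for this sketch; use the blank index
    of your own vocabulary.
    """
    return [label for label, _ in groupby(label_ids) if label != blank_id]


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: edit distance divided by reference length."""
    return edit_distance(ref, hyp) / max(len(ref), 1)


# Frame-level argmax output of a hypothetical CTC model.
raw = [0, 1, 1, 0, 1, 2, 2, 0]
reference = [1, 1, 2]

collapsed = ctc_collapse(raw)
cer_raw = cer(reference, raw)              # inflated by repeats and blanks
cer_collapsed = cer(reference, collapsed)  # computed on the collapsed output
```

Note that the blank between the two `1` labels is what keeps them as two separate characters after collapsing; without it they would merge into one.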