naymaraq closed this issue 1 year ago
We may add that flag eventually, but it is not in our immediate plans. For now, we just remove any unwanted tokens from the transcripts themselves.
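For anyone wanting to try the workaround above, here is a minimal sketch of stripping filler tokens from a transcript before scoring. It is not the script used for the paper: the `FILLERS` set and the `strip_fillers` and `word_error_rate` helpers are hypothetical names made up for illustration, and the WER here is a plain word-level edit distance, which may differ from what your scoring tool computes.

```python
# Assumption: this filler list is illustrative only; extend it to match
# whatever tokens you consider unwanted in your transcripts.
FILLERS = {"um", "uh"}


def strip_fillers(tokens, fillers=FILLERS):
    """Return the tokens with filler words removed (case-insensitive)."""
    return [t for t in tokens if t.lower() not in fillers]


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one token")
    # prev[j] holds the edit distance between the reference prefix seen
    # so far and the first j hypothesis tokens.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


# Toy example: the reference contains fillers, the hypothesis does not.
reference = "um we uh expect revenue to grow".split()
hypothesis = "we expect revenue to grow".split()

wer_with_fillers = word_error_rate(reference, hypothesis)
wer_without_fillers = word_error_rate(strip_fillers(reference),
                                      strip_fillers(hypothesis))
```

Filtering both the reference and the hypothesis keeps the comparison symmetric. If only one side is filtered, every filler left on the other side is counted as an insertion or deletion.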
@qmac In the paper (https://arxiv.org/pdf/2104.11348v3.pdf), the reported WER is 11.3. Does this include filler words? Is there a script I can use to reproduce the paper's result from the Rev .nlp output files (https://github.com/revdotcom/speech-datasets/tree/main/earnings21/output/rev)?
@naymaraq Yes, it does include filler words. Let me see if we can find that script.
Hi
Do you plan to add a flag to disable filler words (like um, uh)?