kensho-technologies / pyctcdecode

A fast and lightweight python-based CTC beam search decoder for speech recognition.
Apache License 2.0

Fix bug with scoring with end of statement #96

Closed lopez86 closed 1 year ago

lopez86 commented 1 year ago

There is a minor bug in the LM score cache: the cache lookup does not take is_eos into account, so the score for a given text may already be cached by the time _get_lm_score is called with is_eos=True. In that case the cached value is returned as-is and the additional scoring logic for is_eos=True is never applied. This PR makes the LM cache use both the text and the value of is_eos in the key to avoid this issue.
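To illustrate the failure mode, here is a minimal, self-contained sketch. It is not the pyctcdecode implementation: toy_lm_score, EOS_ADJUSTMENT, TextKeyedCache, and TextAndEosKeyedCache are hypothetical names invented for this example, and the scoring rule is an arbitrary stand-in for a real language model. It only shows how a cache keyed on text alone can return a stale non-EOS score, and how adding is_eos to the key avoids that.

```python
from typing import Dict, Tuple

# Hypothetical end-of-sentence adjustment; not a value taken from pyctcdecode.
EOS_ADJUSTMENT = -1.0


def toy_lm_score(text: str, is_eos: bool) -> float:
    """Stand-in for a language model: -1 per word, plus an extra term at EOS."""
    score = -float(len(text.split()))
    if is_eos:
        score += EOS_ADJUSTMENT
    return score


class TextKeyedCache:
    """Buggy variant (assumed shape of the problem): the key ignores is_eos."""

    def __init__(self) -> None:
        self._cache: Dict[str, float] = {}

    def get_score(self, text: str, is_eos: bool = False) -> float:
        # A hit here may have been computed with a different is_eos value.
        if text not in self._cache:
            self._cache[text] = toy_lm_score(text, is_eos)
        return self._cache[text]


class TextAndEosKeyedCache:
    """Fixed variant: the key is the pair (text, is_eos)."""

    def __init__(self) -> None:
        self._cache: Dict[Tuple[str, bool], float] = {}

    def get_score(self, text: str, is_eos: bool = False) -> float:
        key = (text, is_eos)
        if key not in self._cache:
            self._cache[key] = toy_lm_score(text, is_eos)
        return self._cache[key]


buggy = TextKeyedCache()
buggy_partial = buggy.get_score("hello world")            # caches the non-EOS score
buggy_final = buggy.get_score("hello world", is_eos=True)  # stale: EOS term is missing

fixed = TextAndEosKeyedCache()
fixed_partial = fixed.get_score("hello world")
fixed_final = fixed.get_score("hello world", is_eos=True)  # separate entry, EOS term applied
```

In the text-keyed variant both calls return the same value, because the second call hits the entry stored by the first. With the composite key the two calls populate separate entries, at the cost of at most one extra cache entry per text.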