kensho-technologies / pyctcdecode

A fast and lightweight python-based CTC beam search decoder for speech recognition.
Apache License 2.0
416 stars 89 forks

some insights required on Hotword - Boosting #56

Closed spranjal25 closed 2 years ago

spranjal25 commented 2 years ago

Hi, I was wondering if there's a paper or documentation for pyctcdecode that explains hotword boosting. I'd like to understand exactly how `hotword_weight` changes my output: does it weight the beam search so that beams containing the hotwords rank more prominently? Any help is appreciated. Thanks!

mpierrau commented 2 years ago

Hi! The creators elaborate on this in a talk they gave at a Hugging Face event! :)

gkucsko commented 2 years ago

Hi, the easiest way is probably to look at what the code does (https://github.com/kensho-technologies/pyctcdecode/blob/main/pyctcdecode/decoder.py#L300). Essentially, it shallow-fuses an additional score into the language model score (that is, adds it with a weight), which boosts the LM-provided scores for hypotheses that contain a hotword. There is also some partial scoring implemented so that beams containing the beginning of a hotword don't get pruned before decoding finishes. Hope that helps!
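To make the idea concrete, here is a minimal toy sketch of shallow fusion with a hotword bonus. It is not the pyctcdecode implementation: the function names (`hotword_score`, `partial_hotword_score`, `fused_score`, `rank`), the example beams, and all the log-probabilities are made up for illustration, and the partial-score scaling is just one plausible assumption about how a partially typed hotword could be rewarded.

```python
import re

def hotword_score(text, hotwords, hotword_weight):
    """Toy bonus: weight times the number of whole-word hotword matches."""
    if not hotwords:
        return 0.0
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in hotwords) + r")\b"
    )
    return hotword_weight * len(pattern.findall(text))

def partial_hotword_score(partial_word, hotwords, hotword_weight):
    """Toy partial bonus (assumed scaling): fraction of the shortest
    hotword that the unfinished word already covers, times the weight."""
    if not partial_word:
        return 0.0
    candidates = [w for w in hotwords if w.startswith(partial_word)]
    if not candidates:
        return 0.0
    shortest = min(candidates, key=len)
    return hotword_weight * len(partial_word) / len(shortest)

def fused_score(acoustic_logp, lm_logp, text, partial_word, hotwords, weight):
    """Shallow fusion: add the hotword bonuses on top of the other scores."""
    return (
        acoustic_logp
        + lm_logp
        + hotword_score(text, hotwords, weight)
        + partial_hotword_score(partial_word, hotwords, weight)
    )

# Hypothetical beams: (finished text, unfinished word, acoustic logp, LM logp).
BEAMS = [
    ("call ken", "", -4.0, -3.0),
    ("call kensho", "", -4.5, -6.0),
]
HOTWORDS = ["kensho"]

def rank(weight):
    """Return beam texts sorted from best to worst fused score."""
    scored = [
        (fused_score(ac, lm, text, partial, HOTWORDS, weight), text)
        for text, partial, ac, lm in BEAMS
    ]
    return [text for _, text in sorted(scored, reverse=True)]

if __name__ == "__main__":
    print(rank(0.0))   # no boost: the LM-preferred beam wins
    print(rank(10.0))  # with boost: the beam containing the hotword wins
```

With a weight of zero, the beam the language model prefers stays on top; with a large enough weight, the beam containing the hotword overtakes it. In the library itself, the corresponding knobs are the `hotwords` and `hotword_weight` arguments to `decoder.decode(...)`.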