kensho-technologies / pyctcdecode

A fast and lightweight python-based CTC beam search decoder for speech recognition.
Apache License 2.0

How to get the timestamps? #8

Closed by khursani8 3 years ago

khursani8 commented 3 years ago

Hi, I'm wondering how to get the timestamps for my decoded string.

Thanks

poneill commented 3 years ago

Hi,

You can get logical timestamps for words with something like:

beams = decoder.decode_beams(logits)
top_beam = beams[0]
transcript, lm_state, indices, logit_score, lm_score = top_beam

indices indexes back into the logit matrix: it tells you which span of logit frames is associated with each word. If you know the absolute timing of your logit matrix (that is, how much audio each logit frame covers and where the first frame starts), you can back out timestamps from that.
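As a rough sketch of that last step, the conversion from frame spans to seconds could look something like the following. It assumes indices is a list of (word, (start_frame, end_frame)) pairs and that every logit frame covers a fixed amount of audio. The helper name frames_to_timestamps, the seconds_per_frame and offset_seconds parameters, and the 0.02 s value are all made up for illustration and are not part of pyctcdecode; the real frame duration depends on your acoustic model.

```python
from typing import List, Tuple

# Assumed shape of one entry in `indices`: (word, (start_frame, end_frame)).
WordFrames = Tuple[str, Tuple[int, int]]


def frames_to_timestamps(
    word_frames: List[WordFrames],
    seconds_per_frame: float,
    offset_seconds: float = 0.0,
) -> List[Tuple[str, float, float]]:
    """Convert per-word logit frame spans into (word, start_s, end_s) tuples.

    Hypothetical helper, not a pyctcdecode function.
    seconds_per_frame: audio duration covered by one logit frame (model-specific).
    offset_seconds: absolute time of the first logit frame, if the logits
        were computed on a chunk cut out of a longer recording.
    """
    timestamps = []
    for word, (start_frame, end_frame) in word_frames:
        start = offset_seconds + start_frame * seconds_per_frame
        end = offset_seconds + end_frame * seconds_per_frame
        timestamps.append((word, start, end))
    return timestamps


# Stand-in for the `indices` value unpacked from the top beam.
example_indices = [("hello", (10, 25)), ("world", (30, 48))]

# 0.02 s per frame is only an illustrative value.
word_times = frames_to_timestamps(example_indices, seconds_per_frame=0.02)
for word, start, end in word_times:
    print(f"{word}: {start:.2f}s - {end:.2f}s")
```

With real decoder output you would pass the indices value from the top beam in place of example_indices, and set offset_seconds if the logits came from a chunk of a longer recording.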

khursani8 commented 3 years ago

Thanks, I will try it later