githubharald / CTCWordBeamSearch

Connectionist Temporal Classification (CTC) decoder with dictionary and language model.
https://towardsdatascience.com/b051d28f3d2e
MIT License
557 stars, 160 forks

Why are log probabilities not used? #36

Closed by janvainer 4 years ago

janvainer commented 4 years ago

Thank you for this awesome repo! ;) I was wondering why log probabilities are not used. Is the beam search numerically stable even for long sequences?

githubharald commented 4 years ago
  1. For my use case (text recognition), I usually had around 100 time-steps, and I did not run into numerical issues at that length.
  2. There was already a discussion about this, and the changes may already be implemented in the fork; see: https://github.com/githubharald/CTCWordBeamSearch/issues/13
  3. If you're comfortable with C++, it should not be too difficult to implement the changes.
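
To make the numerical concern concrete, here is a minimal sketch of how a product of per-step probabilities behaves as the sequence gets longer. It is not code from this repository: the function names and the per-step probability of 0.01 are illustrative assumptions, and it relies only on Python floats being IEEE 754 doubles.

```python
import math

def sequence_score_linear(p, steps):
    """Multiply a per-step probability p over `steps` time-steps (linear space)."""
    score = 1.0
    for _ in range(steps):
        score *= p
    return score

def sequence_score_log(p, steps):
    """Accumulate the same score as a sum of log-probabilities (log space)."""
    score = 0.0
    log_p = math.log(p)
    for _ in range(steps):
        score += log_p
    return score

# Hypothetical per-step probability, chosen only to show the effect.
P_STEP = 0.01

short_linear = sequence_score_linear(P_STEP, 100)  # still representable as a double
long_linear = sequence_score_linear(P_STEP, 200)   # underflows to exactly 0.0
long_log = sequence_score_log(P_STEP, 200)         # stays a finite negative number
```

With this deliberately pessimistic per-step value, 100 time-steps still fit in a double, while 200 time-steps underflow to zero in linear space and remain finite in log space. Real CTC outputs are often far more peaked than 0.01 per step, so the point at which underflow appears depends on the data.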
janvainer commented 4 years ago

Thanks for the response, I will look into the forked repo. :)

weinman commented 4 years ago

@LordOfLuck FYI, I never did make any log-space changes. But as @githubharald says, the conversion process would be moderately straightforward.
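
For readers wondering what such a conversion involves, the usual approach is to store log-probabilities, turn every multiplication into an addition, and turn every addition into a log-sum-exp. The sketch below shows those two primitives in Python. It is an assumption about how a port could look, not code from this repository or the fork, and the names `LOG_ZERO`, `log_mul`, and `log_add` are hypothetical.

```python
import math

LOG_ZERO = -math.inf  # log(0): represents an impossible event

def log_mul(log_a, log_b):
    """Log-space counterpart of a * b."""
    return log_a + log_b

def log_add(log_a, log_b):
    """Log-space counterpart of a + b, computed without leaving log space."""
    if log_a == LOG_ZERO:
        return log_b
    if log_b == LOG_ZERO:
        return log_a
    hi, lo = max(log_a, log_b), min(log_a, log_b)
    # log(e^hi + e^lo) = hi + log(1 + e^(lo - hi)); lo - hi <= 0, so exp cannot overflow.
    return hi + math.log1p(math.exp(lo - hi))

# Example: combining two scores that would both underflow in linear space.
tiny = -1000.0                  # math.exp(-1000.0) is 0.0 as a double
combined = log_add(tiny, tiny)  # equals -1000 + log(2)
```

In a CTC beam search, `log_add` would replace the places where path probabilities are summed (for example, combining the blank-ending and non-blank-ending probability of a beam), and `log_mul` would replace the places where a beam is extended by the next character's probability.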

janvainer commented 4 years ago

@weinman Thank you for the info. I am testing the code on long sequences now. If the decoding fails for my case, I may implement the log-space operations in the future.