githubharald / CTCDecoder

Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon search, prefix search, and token passing. Implemented in Python.
https://towardsdatascience.com/3797e43a86c
MIT License
817 stars 182 forks

Your beamsearch decoder never gives result like aa or ll #10

Closed menglin0320 closed 5 years ago

menglin0320 commented 5 years ago

Referring to https://github.com/githubharald/CTCDecoder/blob/master/src/BeamSearch.py: there seems to be a bug. I'm trying to debug it now.

githubharald commented 5 years ago

In case you think there is a bug, you have to provide the following data to make the problem reproducible:

menglin0320 commented 5 years ago

I tested it with this example:

classes = 'ab'
mat = np.array([[0.8, 0, 0.2], [0.4, 0.0, 0.6], [0.8, 0, 0.2]])

And I realize the implementation is correct.

It's not a bug. The algorithm simply disfavors repeated characters: in the example above, 'a' corresponds to 6 paths (aaa, -aa, aa-, a--, --a, -a-), but 'aa' corresponds to only one path, 'a-a'. The model needs to be very well trained to be able to output 'aa'.
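To check the path counts above numerically, here is a minimal brute-force sketch (not code from the repo). It enumerates every path through the example matrix, collapses each one with the usual CTC rule, and sums the probabilities per labeling. It assumes the last column of mat is the blank; the helper names collapse and labeling_probs are made up for this illustration.

```python
import itertools
from collections import defaultdict

import numpy as np

classes = 'ab'
blank = len(classes)  # assumption: the last column of mat is the CTC blank
mat = np.array([[0.8, 0, 0.2], [0.4, 0.0, 0.6], [0.8, 0, 0.2]])


def collapse(path):
    """Merge repeated labels, then drop blanks (the CTC collapse rule)."""
    out = []
    prev = None
    for label in path:
        if label != prev and label != blank:
            out.append(classes[label])
        prev = label
    return ''.join(out)


def labeling_probs(mat):
    """Sum the probability of every path, grouped by its collapsed labeling."""
    num_steps, num_labels = mat.shape
    probs = defaultdict(float)
    for path in itertools.product(range(num_labels), repeat=num_steps):
        p = float(np.prod([mat[t, c] for t, c in enumerate(path)]))
        probs[collapse(path)] += p
    return dict(probs)


probs = labeling_probs(mat)

# Best path decoding for comparison: most likely label per time step, then collapse.
best_path = collapse(mat.argmax(axis=1).tolist())

print(probs['a'], probs['aa'], best_path)
```

For this matrix the six paths for 'a' sum to about 0.592, while the single path 'a-a' gives about 0.384. So a decoder that scores labelings by summed path probability prefers 'a', even though the single most likely path (a, blank, a) collapses to 'aa'.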

menglin0320 commented 5 years ago

Hmm, this problem is mitigated if every character in the sequence is wide. I guess I need to enlarge the feature map fed to the LSTM.
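One way to see why wider characters might help: with more time steps per character, 'aa' gets many more valid paths relative to 'a'. The sketch below only counts paths for a toy alphabet of 'a' plus blank, and it is an illustration rather than anything from the repo. Counts are not probabilities, so this shows the trend, not what a trained model will output. count_paths is a hypothetical helper.

```python
import itertools
from collections import Counter

BLANK = '-'


def collapse(path):
    """Merge repeated labels, then drop blanks (the CTC collapse rule)."""
    out = []
    prev = None
    for label in path:
        if label != prev and label != BLANK:
            out.append(label)
        prev = label
    return ''.join(out)


def count_paths(num_steps, alphabet='a' + BLANK):
    """Count how many paths of length num_steps collapse to each labeling."""
    counts = Counter()
    for path in itertools.product(alphabet, repeat=num_steps):
        counts[collapse(path)] += 1
    return counts


for num_steps in (3, 6):
    counts = count_paths(num_steps)
    print(num_steps, counts['a'], counts['aa'])
```

With 3 time steps this gives 6 paths for 'a' and 1 for 'aa'. With 6 time steps it gives 21 for 'a' and 35 for 'aa', so the repeated-character labeling is no longer starved of paths.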