Closed menglin0320 closed 5 years ago
in case you think there is a bug, you have to provide the following data to make the problem re-producible:
I tested it with this example: classes = 'ab', mat = np.array([[0.8, 0, 0.2], [0.4, 0.0, 0.6], [0.8, 0, 0.2]])
And I realize the implementation is correct.
It's not a bug; it's just that this algorithm discourages repeated characters. In the example above (with '-' as the blank), 'a' corresponds to 6 paths: aaa, -aa, aa-, a--, --a, -a-, but 'aa' corresponds only to 'a-a'. The model needs to be very well trained to give the result 'aa'.
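To make the path counting concrete, here is a minimal brute-force sketch that sums the probability of every path for each labeling. It is not the repository's beam search; the helper names `collapse` and `labeling_probs` are made up for illustration, and it assumes the blank is the last column of mat. Plain lists are used instead of np.array so it runs without NumPy.

```python
from collections import defaultdict
from itertools import product

classes = 'ab'
# Same values as mat above; columns are 'a', 'b', blank (blank-last is an assumption).
mat = [[0.8, 0.0, 0.2],
       [0.4, 0.0, 0.6],
       [0.8, 0.0, 0.2]]
blank = len(classes)


def collapse(path):
    """Apply the CTC collapse rule: merge repeated indices, then drop blanks."""
    out = []
    prev = None
    for idx in path:
        if idx != prev and idx != blank:
            out.append(classes[idx])
        prev = idx
    return ''.join(out)


def labeling_probs(mat):
    """Sum path probabilities per labeling by enumerating every path (hypothetical helper)."""
    probs = defaultdict(float)
    for path in product(range(len(mat[0])), repeat=len(mat)):
        p = 1.0
        for t, idx in enumerate(path):
            p *= mat[t][idx]
        if p > 0:  # skip impossible paths, e.g. any path containing 'b'
            probs[collapse(path)] += p
    return dict(probs)


probs = labeling_probs(mat)
for text, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(repr(text), round(p, 3))
```

Under these assumptions the single best path is 'a-a' (0.8 * 0.6 * 0.8 = 0.384), yet the six paths for 'a' add up to more than that, which is why a decoder that sums over paths prefers 'a'.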
Hmm, this problem is mitigated if every character in the sequence is wide, i.e. spans several time steps. I guess I need to enlarge the feature map fed to the LSTM.
Referring to https://github.com/githubharald/CTCDecoder/blob/master/src/BeamSearch.py: there seems to be a bug. I'm trying to debug it now.