Closed Tedonze closed 3 years ago
The best option would be to retrain the model with number-only training data.
The easiest solution would of course be to recognize the text with the current model and then remove all non-digit characters from the output. I have no idea how well that would perform, though; you would have to try it.
Yes, that is the same solution I want to test. I am now trying to find this output (where can I find this tensor in your model, please?) and implement my own decoder that takes the max probability only over the digit characters.
output of softmax (tensor of shape TxBxC, T=x-coordinate, B=batch element index, C=chars): https://github.com/githubharald/SimpleHTR/blob/master/src/model.py#L146
add it to eval_list: https://github.com/githubharald/SimpleHTR/blob/master/src/model.py#L259
and here the softmax output (eval_res[0], a numpy array) is handed to a custom decoder, so here you can instead put your decoder: https://github.com/githubharald/SimpleHTR/blob/master/src/model.py#L281
You can try either best path decoding where you only take the max-scoring digit at each time-step, or standard best path decoding followed by removing the non-digit chars at the end.
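To make the two options concrete, here is a minimal sketch in plain Python, not code from the repo. It decodes one batch element, so you would pass it a TxC slice of the softmax output (e.g. `eval_res[0][:, b, :]`). The names `best_path`, `digits_only` and `DIGITS` are made up for this example, and it assumes that the CTC blank is the last class (index `len(chars)`) and that the blank stays allowed in the digit-restricted variant; please check both against the model code.

```python
DIGITS = set("0123456789")


def best_path(mat, chars, allowed=None):
    """Greedy (best path) CTC decoding for a single batch element.

    mat:     T x C per-time-step probabilities, with C == len(chars) + 1
    chars:   characters of classes 0..C-2; class C-1 is ASSUMED to be the blank
    allowed: optional set of characters the argmax is restricted to
    """
    blank = len(chars)  # assumption: blank is the last class
    if allowed is None:
        candidates = list(range(blank + 1))
    else:
        # keep only the allowed characters, plus the blank
        candidates = [i for i, c in enumerate(chars) if c in allowed] + [blank]

    text = []
    prev = blank
    for row in mat:
        best = max(candidates, key=lambda i: row[i])
        # collapse repeated labels and drop blanks
        if best != prev and best != blank:
            text.append(chars[best])
        prev = best
    return "".join(text)


def digits_only(text):
    """Remove every character that is not one of 0-9."""
    return "".join(c for c in text if c in DIGITS)


# Toy example: classes are 'a', '1', 'b', '2', blank
chars = "a1b2"
mat = [
    [0.6, 0.3, 0.0, 0.0, 0.1],
    [0.1, 0.1, 0.0, 0.0, 0.8],
    [0.0, 0.0, 0.1, 0.7, 0.2],
    [0.0, 0.0, 0.1, 0.7, 0.2],
]

option_a = best_path(mat, chars, allowed=DIGITS)  # max-scoring digit per time-step
option_b = digits_only(best_path(mat, chars))     # standard decoding, then filter
```

On this toy input the two options give different strings, since the restricted decoder falls back to the best digit in the first time-step while the filter just drops the non-digit that was recognized there. Which one works better on real data is something you would have to try.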
Thank you so much. I will do my best.
I want to reduce the charList predicted by the model to 0123456789, i.e. specialise this model for number recognition. Any ideas, please?