Open BingqingQu opened 6 years ago
Is your code complete? It looks like the variables out and timestamp are not used further...
Can you give more information about the output format? I see that the files always have 156 lines with several probabilities, but none of these values seem to be equal to the ones that are output with --probabilities.
https://github.com/tmbdev/ocropy/wiki/OCRopus-File-Formats#lattice-files This format was used in ocropy 0.6.
@amitdo The output files look different. Here is an example:
His patch just outputs the raw result of the prediction.
What you see with the current text/prob. options (without this patch) is the 'best' path that translate_back() found for you.
The format in my link is more human readable. I was not very clear in my previous comment, sorry about that.
Related: #25
The number of lines (156) is the size of the codec (chars) in the model you use.
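To make the relationship between the raw matrix and the 'best' path more concrete, here is a minimal NumPy sketch. It assumes a matrix with one row per timestep and one column per codec entry, with index 0 acting as the blank (no character) class. The files described above have one line per codec entry, so they may be stored transposed relative to this layout. The toy codec and the greedy_decode helper are hypothetical, and the collapsing rule is a simplified CTC-style greedy decode, not the actual logic of translate_back().

```python
import numpy as np

# Hypothetical toy codec; index 0 is assumed to be the blank class.
CODEC = ["", "a", "b", "c"]

def greedy_decode(probs, codec=CODEC, blank=0):
    """Simplified greedy decode: argmax per timestep, merge repeats, drop blanks."""
    best = probs.argmax(axis=1)  # most probable class at each timestep
    chars = []
    prev = blank
    for idx in best:
        # Emit a character only when it is not blank and differs from the previous class.
        if idx != blank and idx != prev:
            chars.append(codec[idx])
        prev = idx
    return "".join(chars)

# Toy probability matrix: 5 timesteps x 4 codec entries (each row sums to 1).
probs = np.array([
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.80, 0.05, 0.05],
    [0.10, 0.70, 0.10, 0.10],
    [0.80, 0.10, 0.05, 0.05],
    [0.10, 0.10, 0.10, 0.70],
])

# The number of columns equals the size of the codec.
assert probs.shape[1] == len(CODEC)

decoded = greedy_decode(probs)
print(decoded)
```

With a real model the second dimension would be 156 instead of 4, matching the codec size mentioned above.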
Okay, then I don't think this matrix is interesting enough to justify an option in ocropus-rpred. One can use ocrolib as a library for such computations. More advanced lattice/alternative calculations could be interesting, as outlined in #186.
There are also the --save and --show options for visual debug information about this matrix.
This is intended to be an extension of the --probabilities option. Instead of just printing the probabilities for the recognised characters, --probmat will compute the complete probability matrix.
At each "timestep" the probability for each character is computed. This could be used, for example, as input to a language model, which would then have access to the probabilities of the other characters as well.
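As a rough illustration of how a downstream consumer such as a language model might use the full matrix, the sketch below lists the most probable alternatives at each timestep. The top_k_alternatives helper, the toy codec, and the (timesteps x codec size) layout are assumptions made for this example; they are not part of ocropy or of this patch.

```python
import numpy as np

def top_k_alternatives(probs, codec, k=2):
    """Return, for each timestep, the k most probable (char, prob) pairs."""
    # Sort class indices by descending probability and keep the first k per row.
    order = np.argsort(probs, axis=1)[:, ::-1][:, :k]
    return [
        [(codec[j], float(probs[t, j])) for j in row]
        for t, row in enumerate(order)
    ]

# Hypothetical toy codec; index 0 is assumed to be the blank class.
codec = ["", "a", "b"]

# Toy probability matrix: 2 timesteps x 3 codec entries.
probs = np.array([
    [0.1, 0.6, 0.3],
    [0.7, 0.1, 0.2],
])

alts = top_k_alternatives(probs, codec, k=2)
for t, candidates in enumerate(alts):
    print(t, candidates)
```

A language model could then rescore these candidates instead of seeing only the single best character per position.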