Good question. I think the make test output should contain the following pattern:
DEBUG ocrd_typegroups_classifier - Detected fraktur
If @seuretm agrees, I'll wait for the updated sample model (possibly with class names updated to English, as I understood @seuretm) and see if I can check for the result in make test. This would make the test complete.
I don't think it makes sense to check for the exact classification result, just the coarse result "fraktur".
Or maybe just check the XML result ;-)
So, with the current neural network, the output to expect is this:
14:17:37.216 DEBUG ocrd_typegroups_classifier - Detected fraktur:27, textura:23, antiqua:21, rotunda:17, bastarda:11, schwabacher:2
On the CPU of my laptop, the run time is roughly 11 seconds. Using a recent GPU can make it 30 or 40 times faster.
I agree with you, @mikegerber: for this specific page, we can consider for now that the output is correct if it contains "Detected fraktur:", regardless of the score or the rest of the line. Those might change if the model is replaced, while we would still expect the same top-1 result.
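To make the criterion concrete, here is a minimal sketch of such a check in Python. It assumes the log line keeps the shape shown above; the helper name top1_is and the regular expression are hypothetical, not part of ocrd_typegroups_classifier.

```python
import re

# Hypothetical pattern: captures the first class name after "Detected ",
# i.e. the top-1 result, and ignores scores and the rest of the line.
DETECTED = re.compile(r"Detected (\w+):")

def top1_is(log_text, expected="fraktur"):
    """Return True if the first classifier log line reports `expected` as top-1."""
    for line in log_text.splitlines():
        if "ocrd_typegroups_classifier" not in line:
            continue
        match = DETECTED.search(line)
        if match:
            return match.group(1) == expected
    return False
```

The captured output of make test could be passed to top1_is as a string; a replaced model that changes the scores but keeps fraktur on top would still pass.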
Also, the German and English class names overlap a lot:
translation = {
    'griechisch': 'greek',
    'hebräisch': 'hebrew',
    'kursiv': 'italic',
    'andere_schrift': 'other_font',
    'nicht_schrift': 'not_a_font'
}
We could as well consider any output containing a German name to be wrong.
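A rough sketch of that second check, assuming the German names in the translation table above are the only ones that differ from their English counterparts. The function name german_names_in is made up for illustration.

```python
# German class names taken from the translation table above.
GERMAN_ONLY_NAMES = (
    'griechisch',
    'hebräisch',
    'kursiv',
    'andere_schrift',
    'nicht_schrift',
)

def german_names_in(output):
    """Return the German class names that still appear in the output text."""
    return [name for name in GERMAN_ONLY_NAMES if name in output]
```

An empty list would mean the output uses no German-only name; anything else could be treated as a failure once the model with English names is in place.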
What is the result you expect from running the test.sh script? So we can compare :)