seuretm / ocrd_typegroups_classifier

Apache License 2.0

Expected results of the test #3

Closed: kba closed 5 years ago

kba commented 5 years ago

What is the result you expect from running the test.sh script? So we can compare :)

mikegerber commented 5 years ago

Good question. I think the make test output should contain the following pattern:

DEBUG ocrd_typegroups_classifier - Detected fraktur

If @seuretm agrees, I'll wait for the updated sample model (perhaps with class names updated to English, as I understood @seuretm) and see if I can check for the result in make test. This would make the test complete.

I don't think it makes sense to check for the exact classification result, just the coarse result "fraktur".

mikegerber commented 5 years ago

Or maybe just check the XML result ;-)

seuretm commented 5 years ago

So, with the current neural network, the output to expect is this:

14:17:37.216 DEBUG ocrd_typegroups_classifier - Detected fraktur:27, textura:23, antiqua:21, rotunda:17, bastarda:11, schwabacher:2

On the CPU of my laptop, the run time is roughly 11 seconds. Using a recent GPU can make it 30 or 40 times faster.

I agree with you, @mikegerber: for this specific page, we can consider the output correct for now if it contains "Detected fraktur:", regardless of the score or the rest of the line (which might change if the model is replaced, whereas we would expect the same top-1 result).
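A minimal sketch of such a check in Python, assuming the log line keeps the format quoted above and that the first class listed after "Detected" is the top-1 result. The helper name `top_typegroup` and the regular expression are illustrative only; they are not part of ocrd_typegroups_classifier.

```python
import re

# Hypothetical helper, not part of ocrd_typegroups_classifier.
# Assumption: the log line looks like "... Detected <label>:<score>, ..."
# and the first label listed is the top-1 class.
DETECTED = re.compile(r"Detected (?P<label>[^\s:,]+):(?P<score>\d+)")


def top_typegroup(log_text):
    """Return the first class name after 'Detected', or None if absent."""
    match = DETECTED.search(log_text)
    return match.group("label") if match else None


sample = (
    "14:17:37.216 DEBUG ocrd_typegroups_classifier - Detected "
    "fraktur:27, textura:23, antiqua:21, rotunda:17, bastarda:11, schwabacher:2"
)

# Only the coarse top-1 result is compared; scores and the rest of the
# line are ignored, since they may change when the model is replaced.
assert top_typegroup(sample) == "fraktur"
```

Comparing only the first label keeps the test stable across model updates, at the cost of not noticing changes in the remaining scores.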

Also, the German and English names overlap considerably:

translation = {
    'griechisch': 'greek',
    'hebräisch': 'hebrew',
    'kursiv': 'italic',
    'andere_schrift': 'other_font',
    'nicht_schrift': 'not_a_font'
}

We could also consider any output containing a German name to be wrong.
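One possible way to flag German names in the output, sketched in Python. It reuses the translation mapping quoted above; the function name `german_names_in` is hypothetical, and the whole-word matching is an assumption about how class names appear in the log line.

```python
import re

# Mapping quoted earlier in this thread: German class name -> English class name.
translation = {
    'griechisch': 'greek',
    'hebräisch': 'hebrew',
    'kursiv': 'italic',
    'andere_schrift': 'other_font',
    'nicht_schrift': 'not_a_font'
}


def german_names_in(output):
    """Return the German class names that occur as whole words in output."""
    found = []
    for name in translation:
        # Lookarounds avoid matching inside a longer identifier.
        if re.search(r"(?<!\w)" + re.escape(name) + r"(?!\w)", output):
            found.append(name)
    return found


# An output is treated as wrong if any German name shows up in it.
assert german_names_in("Detected fraktur:27, kursiv:3") == ["kursiv"]
assert german_names_in("Detected fraktur:27, italic:3") == []
```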