Open bmwmy opened 2 months ago
Can you show me exactly which commands you're running, and could you provide an XML file and image where this occurs? The segmentation has not changed; only the line extraction before feeding into the recognizer is new. It is disabled by default for pre-5.0 models, though, so I'm wondering where your issues come from.
This is the command:
kraken -i "yarab_deskewed.png" "yarab.txt" segment -bl ocr -m arabic_best.mlmodel
Kraken_Dated_07-09-2022.pdf
Kraken_4.13.20.pdf
kraken_5dev23.pdf
yarab_deskewed (the original file being OCRed)
With every major Kraken update, I have noticed a decrease in accuracy.
Hi, I tried the same page with the same setup on both Kraken 5.x and Kraken 4.x using the provided Arabic_best.ml, and there are more errors in the latest version (5.x). I think this relates to changes in the segmenter, which has now been modified to allow curved segments, and that is probably not good for Arabic (I cannot find the issue #).
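To put a number on "more errors" across versions, one option is to score each version's text output against a hand-corrected transcription of the page using character error rate (CER). The following is only a minimal sketch using the Python standard library, not part of Kraken; the function names and the file names in the usage note are hypothetical, and it assumes each run wrote plain text the way the command above does.

```python
import unicodedata


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalize(text: str) -> str:
    """Apply NFC normalization and collapse all whitespace runs to one space."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    ref = normalize(reference)
    hyp = normalize(hypothesis)
    if not ref:
        raise ValueError("reference transcription is empty")
    return levenshtein(ref, hyp) / len(ref)


def compare(reference_path: str, *output_paths: str) -> dict:
    """Return {output_path: CER} for each OCR output file (hypothetical helper)."""
    with open(reference_path, encoding="utf-8") as fh:
        reference = fh.read()
    scores = {}
    for path in output_paths:
        with open(path, encoding="utf-8") as fh:
            scores[path] = cer(reference, fh.read())
    return scores
```

For example, `compare("yarab_ground_truth.txt", "yarab_kraken4.txt", "yarab_kraken5.txt")` would return one CER per output file (all three file names are placeholders). Because whitespace is collapsed, differences in line breaks or line order caused by the segmenter still count as errors only where the character sequence itself changes.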