mittagessen / kraken

OCR engine for all the languages
http://kraken.re
Apache License 2.0

What am I doing wrong? #656

Open johnlockejrr opened 1 week ago

johnlockejrr commented 1 week ago

I've been trying for some time to fine-tune a segmentation model for Syriac script with vowels (above and below the line). I'm getting closer, but not close enough. Kraken seems to refuse to comply :)

Ground truth (page-xml):

seg_manual

After fine-tuning with: ketos segtrain -d cuda:0 -f page -t output.txt -q early --min-epochs 100 -cl --threads 10 --resize both --schedule reduceonplateau -i BiblIAlong02_se3_2_tl.mlmodel -o out/syrnt_cl_v1

seg_auto

Any ideas? Should I just use bounding boxes around the baselines instead of polygons?
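For anyone who wants to try the bounding-box variant, here is a minimal sketch of turning a PAGE XML `points` string (the `x,y x,y ...` format used in `Coords`) into its axis-aligned bounding box. The function names `parse_points` and `bounding_box` are made up for illustration and are not part of kraken. Whether boxes actually train better than polygons is exactly the open question here, so treat this only as a way to produce the alternative ground truth.

```python
def parse_points(points: str) -> list[tuple[int, int]]:
    """Parse a PAGE XML points string such as "10,20 30,5 50,40"."""
    pairs = []
    for pair in points.split():
        x, y = pair.split(",")
        pairs.append((int(x), int(y)))
    return pairs


def bounding_box(points: str) -> str:
    """Return the axis-aligned bounding box of a polygon as a points string.

    Corners are listed clockwise starting from the top-left
    (image coordinates, y grows downwards).
    """
    pts = parse_points(points)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    x0, x1 = min(xs), max(xs)
    y0, y1 = min(ys), max(ys)
    return f"{x0},{y0} {x1},{y0} {x1},{y1} {x0},{y1}"


if __name__ == "__main__":
    # Hypothetical polygon, not taken from the actual ground truth.
    print(bounding_box("10,20 30,5 50,40 20,45"))
```

The result could be written back into the `points` attribute of each line's `Coords` element in a copy of the ground truth, leaving the `Baseline` elements untouched.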

johnlockejrr commented 1 week ago

Fine-tuning on blla is way better for text lines, but I lose the classes...

(ketos segtrain -d cuda:0 -f page -t output.txt -q early --min-epochs 60 --threads 10 --resize both --schedule reduceonplateau -i blla.mlmodel -o out_blla/syrnt_blla_v1)

image

johnlockejrr commented 1 week ago

Now it's way better... Anyway, should I add more padding?

[ketos segtrain -d cuda:0 -f page -t output-syrnt_cl_two.txt -q early --min-epochs 50 --threads 10 --resize both --schedule reduceonplateau -i blla.mlmodel -o out_blla/syrnt_two_blla_v1]

Screenshot 2024-11-05 115910