Calamari-OCR / calamari_models

Pretrained mixed models to be used with Calamari.
MIT License

U+EADA when using antiqua_historical_ligs 2020-06-05 #13

Open jbarth-ubhd opened 2 years ago

jbarth-ubhd commented 2 years ago

When using https://github.com/Calamari-OCR/calamari_models/raw/d61781a9a17e20ca38faf71478185585ea227fd9/antiqua_historical_ligs/0.ckpt.h5 (plus the other *.ckpt.* files) with the current ocrd_all Docker image on this scan:

https://digi.ub.uni-heidelberg.de/diglitData/v/montfaucon1719bd2_1.210.tif

I get this XML:

...
ſe mit devant les rangs; & approchant de Xanthe, il uſa dune tromperie
qui lui reuit: E ce agir en honnete homme, dit. il, damener un ſecond,

jbarth-ubhd commented 2 years ago

[image]

jbarth-ubhd commented 2 years ago

Private-use area codepoints detected so far:

U+E8BF  ("C"? "q" open on the right side with "'" above (printing error)?)  banier1754bd1/02_0500 drwBesold1679/0052 drwBesold1679/0112 drwBesold1679/0..
U+EADA  "ſt" ligature  arndt1722/000000_n arndt1722/000000_o arndt1722/000000_q arndt1722/000000_r..
U+EBA3  "fl" ligature  arndt1722/0032 arndt1722/0268 arndt1722/0284_cy arndt1722/0394 arndt1722/05..
U+EBA6  "ſſ" ligature  arndt1722/000000_n arndt1722/000000_q arndt1722/000000_r arndt1722/000000_s..
U+EEDC  "tz" ligature  arndt1722/000000_n arndt1722/000000_q arndt1722/000000_s arndt1722/000000_t..
U+F159  Antiqua "S" / Fraktur "F"?  arndt1722/000000_n arndt1722/000000_t arndt1722/000000_u arndt1722/000000_x..
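
For anyone who wants to check their own OCR output for the same problem, here is a minimal Python sketch of how such codepoints could be found and, where the reading is clear, expanded into ordinary letters. It is not part of Calamari or ocrd_all. The function names, the replacement table, and the sample string are hypothetical; the table only covers the four ligatures from the list above whose reading is not in doubt, and expanding them into plain letters is just one possible post-processing choice.

```python
import unicodedata
from collections import Counter

# Hypothetical replacement table, taken from the readings listed above.
# U+E8BF and U+F159 are deliberately left out: their readings are uncertain.
LIGATURE_READINGS = {
    "\uEADA": "ſt",
    "\uEBA3": "fl",
    "\uEBA6": "ſſ",
    "\uEEDC": "tz",
}


def find_pua(text):
    """Count private-use codepoints (Unicode category 'Co') in text."""
    return Counter(
        f"U+{ord(ch):04X}"
        for ch in text
        if unicodedata.category(ch) == "Co"
    )


def expand_ligatures(text, table=LIGATURE_READINGS):
    """Replace known private-use ligatures; leave everything else untouched."""
    return "".join(table.get(ch, ch) for ch in text)


# Made-up sample line, not taken from the scan above.
sample = "Ge\uEADAalt und Schu\uEEDC"

print(dict(find_pua(sample)))
print(expand_ligatures(sample))
```

Unmapped private-use characters pass through unchanged, so running `find_pua` again on the expanded text shows which codepoints still need a decision.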