Open mikegerber opened 5 years ago
ocrd-ocropy-segment throws an exception on many different inputs. Here is what happens for 5 of the files from the OCR-D GT repo:
# zips all from https://ocr-d-repo.scc.kit.edu/api/v1/metastore/bagit
for z in benner_herrnhuterey04_1748.ocrd.zip buerger_gedichte_1778.ocrd.zip estor_rechtsgelehrsamkeit02_1758.ocrd.zip lohenstein_agrippina_1665.ocrd.zip silesius_seelenlust01_1657.ocrd.zip; do
  echo "== $z"
  cd `mktemp -d`
  cp /srv/data/OCR-D/$z .
  dtrx $z
  cd ${z//.zip}/data
  ocrd-ocropy-segment -l DEBUG -m mets.xml -I OCR-D-IMG -O OCR-D-SEG-LINE 2>&1 | tail -n 1
done
yields:
== benner_herrnhuterey04_1748.ocrd.zip
15:13:48.505 INFO ocrd.workspace - Saving mets '/tmp/tmp.NffpG878nI/benner_herrnhuterey04_1748.ocrd/data/mets.xml'
== buerger_gedichte_1778.ocrd.zip
ValueError: cannot convert float NaN to integer
== estor_rechtsgelehrsamkeit02_1758.ocrd.zip
ValueError: cannot convert float NaN to integer
== lohenstein_agrippina_1665.ocrd.zip
ValueError: cannot convert float NaN to integer
== silesius_seelenlust01_1657.ocrd.zip
15:14:01.768 INFO ocrd.workspace - Saving mets '/tmp/tmp.26Cn1zFHby/silesius_seelenlust01_1657.ocrd/data/mets.xml'
% pip list | grep ocrd-ocropy
ocrd-ocropy 0.0.3
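For reference, the message in the failing runs is the one Python itself raises when `int()` is called on a float NaN. The sketch below only reproduces that message in isolation; it does not show where inside ocrd-ocropy-segment the NaN comes from (that would need the full traceback), and `to_int_or_none` is a hypothetical helper used purely for illustration, not part of ocrd-ocropy.

```python
import math


def to_int_or_none(value):
    """Hypothetical guard: return None instead of raising when value is NaN."""
    if math.isnan(value):
        return None
    return int(value)


# Converting NaN directly reproduces the message seen in the output above.
try:
    int(float("nan"))
except ValueError as exc:
    message = str(exc)
    print(message)

# With the guard, the NaN case can be detected and handled explicitly.
print(to_int_or_none(float("nan")))
print(to_int_or_none(3.7))
```

A guard like this would only hide the symptom; the more useful next step is probably to find out which intermediate value becomes NaN for the three failing documents.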