OCR-D / ocrd_anybaseocr

DFKI Layout Detection for OCR-D
Apache License 2.0

ocrd-anybaseocr-binarize does not handle multiple output groups correctly #39

Closed VolkerHartmann closed 4 years ago

VolkerHartmann commented 4 years ago

    ocrd-anybaseocr-binarize -m weigel/data/mets.xml -I OCR-D-IMG -O OCR-D-IMG-BIN-IMG,OCR-D-IMG-BIN -w weigel/data

creates the following files:

weigel/data/OCR-D-IMG-BIN:
    OCR-D-IMG-BIN_0001.png
    OCR-D-IMG-BIN_0002.png
    OCR-D-IMG-BIN_0003.png
    OCR-D-IMG-BIN_0004.png

weigel/data/OCR-D-IMG-BIN-IMG,OCR-D-IMG-BIN:
    OCR-D-IMG-BIN-IMG,OCR-D-IMG-BIN_0001.xml
    OCR-D-IMG-BIN-IMG,OCR-D-IMG-BIN_0002.xml
    OCR-D-IMG-BIN-IMG,OCR-D-IMG-BIN_0003.xml
    OCR-D-IMG-BIN-IMG,OCR-D-IMG-BIN_0004.xml

VolkerHartmann commented 4 years ago

The same happens for deskewing, but cropping seems to be fine!?

mahmed1995 commented 4 years ago

Fixed in 7d3fffe

kba commented 4 years ago

Again, you either refer to the wrong commit or 7d3fffe does not fix it.

VolkerHartmann commented 4 years ago

For binarization and deskewing, the handling of multiple output groups is still buggy.

    ocrd-anybaseocr-(bin/des) -i anyInput -o out1,out2

produces the following output:

out1,out2:
    total 24
    drwxr-xr-x  2 user user 4096 Jan 30 14:17 .
    drwxrwxr-x 11 user user 4096 Jan 30 14:17 ..
    -rw-r--r--  1 user user 2697 Jan 30 14:17 out1_0001.xml
    -rw-r--r--  1 user user 2699 Jan 30 14:17 out1_0002.xml
    -rw-r--r--  1 user user 2696 Jan 30 14:17 out1_0003.xml
    -rw-r--r--  1 user user 2699 Jan 30 14:17 out1_0004.xml

out2:
    total 636
    drwxr-xr-x  2 user user   4096 Jan 30 14:17 .
    drwxrwxr-x 11 user user   4096 Jan 30 14:17 ..
    -rw-r--r--  1 user user 170291 Jan 30 14:17 out2_0001.png
    -rw-r--r--  1 user user 161358 Jan 30 14:17 out2_0002.png
    -rw-r--r--  1 user user 147802 Jan 30 14:17 out2_0003.png
    -rw-r--r--  1 user user 152206 Jan 30 14:17 out2_0004.png
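A quick way to check a workspace for this symptom is to look for directories whose names still contain a comma. The following is only a sketch for reproducing the report: the helper name find_unsplit_group_dirs is hypothetical and not part of ocrd_anybaseocr or OCR-D core, and it assumes that file group names themselves never contain commas.

```python
import tempfile
from pathlib import Path


def find_unsplit_group_dirs(workspace_dir):
    """Return names of subdirectories that contain a comma.

    Hypothetical helper: a comma in a directory name suggests that the
    whole -O value was used as one folder name instead of being split.
    """
    return sorted(
        entry.name
        for entry in Path(workspace_dir).iterdir()
        if entry.is_dir() and "," in entry.name
    )


if __name__ == "__main__":
    # Rebuild the directory layout from the listing above in a temp dir.
    with tempfile.TemporaryDirectory() as workspace:
        (Path(workspace) / "out1,out2").mkdir()
        (Path(workspace) / "out2").mkdir()
        print(find_unsplit_group_dirs(workspace))
```

An empty result would mean that no directory was named after the unsplit output group string.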

mahmed1995 commented 4 years ago

That was the wrong commit. The correct one is a4c6c6ce.

VolkerHartmann commented 4 years ago

For binarization, deskewing and dewarping, the handling of multiple output groups is still buggy.

    ocrd-anybaseocr-(bin/des) -i anyInput -o out1,out2

produces the following output:

out1,out2:
    total 24
    drwxr-xr-x  2 user user 4096 Jan 30 14:17 .
    drwxrwxr-x 11 user user 4096 Jan 30 14:17 ..
    -rw-r--r--  1 user user 2697 Jan 30 14:17 out1_0001.xml

It still creates a folder named after the complete output group string out1,out2 instead of just out1! The USE attribute of the file group inside the METS file is correct.

It works fine for crop. Why do binarize and deskewing not behave the same?
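The expected behavior can be sketched roughly as follows. This is an illustration only, not the code from any of the commits referenced in this thread: the function names split_output_groups and output_path are hypothetical, and the assumption that the first group receives the PAGE XML and the second the images is read off the listings above.

```python
import os


def split_output_groups(output_file_grp):
    """Split a comma-separated -O value into individual group names."""
    groups = [name.strip() for name in output_file_grp.split(",") if name.strip()]
    if not groups:
        raise ValueError("no output file group given")
    return groups


def output_path(workspace_dir, group, page_number, extension):
    """Build <workspace>/<group>/<group>_<NNNN><extension> for ONE group."""
    file_id = "%s_%04d" % (group, page_number)
    return os.path.join(workspace_dir, group, file_id + extension)


# Assumption: first group holds the PAGE XML, second group the images.
page_grp, image_grp = split_output_groups("out1,out2")
print(output_path("data", page_grp, 1, ".xml"))
print(output_path("data", image_grp, 1, ".png"))
```

The point of the sketch is that the directory name is derived from a single group after splitting, never from the raw out1,out2 string.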

mahmed1995 commented 4 years ago

Commit: 092183d

VolkerHartmann commented 4 years ago

:+1: Works fine for binarization and deskewing. Please fix it also for dewarping.

mahmed1995 commented 4 years ago

Commit: 2dc4de2

Can this issue be closed now?

VolkerHartmann commented 4 years ago

:+1: