MehmedGIT / OtoN_Converter

Converter from basic OCRD process workflow to Nextflow workflow script
Apache License 2.0

Using the same OCR-D processor twice in a workflow #6

Closed MehmedGIT closed 2 years ago

MehmedGIT commented 2 years ago

The following workflow fails to produce an executable Nextflow script. The converter emits ocrd-olena-binarize, ocrd-cis-ocropy-deskew, and ocrd-segment-repair twice each, and both occurrences of a processor get the same Nextflow process name. Compilation of the Nextflow script then fails because of the duplicate names.

   ocrd process \
   "olena-binarize -I OCR-D-IMG -O OCR-D-BIN -P impl sauvola" \
   "anybaseocr-crop -I OCR-D-BIN -O OCR-D-CROP" \
   "olena-binarize -I OCR-D-CROP -O OCR-D-BIN2 -P impl kim" \
   "cis-ocropy-denoise -I OCR-D-BIN2 -O OCR-D-BIN-DENOISE -P level-of-operation page" \
   "cis-ocropy-deskew -I OCR-D-BIN-DENOISE -O OCR-D-BIN-DENOISE-DESKEW -P level-of-operation page" \
   "tesserocr-segment-region -I OCR-D-BIN-DENOISE-DESKEW -O OCR-D-SEG-REG" \
   "segment-repair -I OCR-D-SEG-REG -O OCR-D-SEG-REPAIR -P plausibilize true" \
   "cis-ocropy-deskew -I OCR-D-SEG-REPAIR -O OCR-D-SEG-REG-DESKEW -P level-of-operation region" \
   "cis-ocropy-clip -I OCR-D-SEG-REG-DESKEW -O OCR-D-SEG-REG-DESKEW-CLIP -P level-of-operation region" \
   "tesserocr-segment-line -I OCR-D-SEG-REG-DESKEW-CLIP -O OCR-D-SEG-LINE" \
   "segment-repair -I OCR-D-SEG-LINE -O OCR-D-SEG-REPAIR-LINE -P sanitize true" \
   "cis-ocropy-dewarp -I OCR-D-SEG-REPAIR-LINE -O OCR-D-SEG-LINE-RESEG-DEWARP" \
   "calamari-recognize -I OCR-D-SEG-LINE-RESEG-DEWARP -O OCR-D-OCR -P checkpoint /usr/users/<user>/ocrd/models/calamari-models/\*.ckpt.json"
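
One possible way around the name clash is to give each repeated processor a numbered process name. The Python sketch below only illustrates that idea and is not the converter's actual code: the function `unique_process_names`, the `ocrd_` prefix, and the `_1`/`_2` suffix scheme are all hypothetical, and it assumes process names should avoid hyphens.

```python
from collections import Counter


def unique_process_names(commands):
    """Return one process name per command, unique across the workflow.

    Hypothetical naming scheme: hyphens become underscores, and a processor
    that occurs more than once gets a running number appended.
    """
    # The processor is the first token of each command, e.g. "olena-binarize".
    processors = [cmd.split()[0] for cmd in commands]
    totals = Counter(processors)  # how often each processor occurs overall
    seen = Counter()              # how often each processor has occurred so far
    names = []
    for proc in processors:
        seen[proc] += 1
        base = "ocrd_" + proc.replace("-", "_")
        # Only repeated processors need a suffix to stay unique.
        names.append(f"{base}_{seen[proc]}" if totals[proc] > 1 else base)
    return names


# First three steps of the workflow above.
commands = [
    "olena-binarize -I OCR-D-IMG -O OCR-D-BIN -P impl sauvola",
    "anybaseocr-crop -I OCR-D-BIN -O OCR-D-CROP",
    "olena-binarize -I OCR-D-CROP -O OCR-D-BIN2 -P impl kim",
]
names = unique_process_names(commands)
print(names)
```

With this scheme the two olena-binarize steps would become `ocrd_olena_binarize_1` and `ocrd_olena_binarize_2`, while anybaseocr-crop, which occurs once, keeps an unsuffixed name.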