OpenPecha / OCR-Pipelines


fix(executor) #31

Closed kaldan007 closed 1 year ago

kaldan007 commented 1 year ago

The OCR output path naming has been updated. While testing the pipeline on a failed case (W21784), I found a bug. This case had two issues: the OCR output was completely nonsensical, and the published OPF had no base. Testing showed that the nonsensical OCR output can be avoided by selecting the builtin-weekly model type. As for the empty base, the OCR output was saved under a folder named with the whole image group ID, whereas the OPF formatter uses a function from the buda API in openpecha to navigate the folders when it looks for the OCR output. Because of this mismatch, the OCR output was not found, hence the empty base in the OPF. To avoid this, the OCR output is now saved under a folder that follows the same naming convention as the buda API function.
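For illustration, here is a minimal sketch of the kind of mismatch described above. It is not the code from this PR. The helper names `image_group_folder_name` and `ocr_output_dir`, the directory layout, the example image group IDs, and the rule that an ID of the form `I` plus four digits is shortened to the digits are all assumptions; the actual convention lives in the buda API function in openpecha.

```python
from pathlib import Path


def image_group_folder_name(work_id: str, image_group_id: str) -> str:
    """Hypothetical helper: build the folder name for one image group.

    Assumed rule (not confirmed by this PR): an ID made of "I" followed
    by exactly four digits is shortened to the digits; any other ID is
    kept whole.
    """
    suffix = image_group_id
    prefix, rest = image_group_id[:1], image_group_id[1:]
    if prefix == "I" and len(rest) == 4 and rest.isdigit():
        suffix = rest
    return f"{work_id}-{suffix}"


def ocr_output_dir(base_dir: Path, work_id: str, image_group_id: str) -> Path:
    """Hypothetical helper: where the OCR writer saves its output.

    The writer and the formatter must both go through the same naming
    function, otherwise the formatter looks in a folder that was never
    written and produces an empty base.
    """
    return base_dir / work_id / image_group_folder_name(work_id, image_group_id)


# Example IDs below are made up for illustration.
print(ocr_output_dir(Path("ocr_output"), "W21784", "I1234"))
print(ocr_output_dir(Path("ocr_output"), "W21784", "I1KG9999"))
```

The point of the sketch is the design choice rather than the exact rule: deriving the save path and the lookup path from one shared function keeps the two sides from drifting apart again.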

eroux commented 1 year ago

Thanks!