R0Wi-DEV / workflow_ocr

This is a Nextcloud Workflow App which enables you to process files via OCR on the server side.
GNU Affero General Public License v3.0

misleading text - [workflow_ocr] Warning: OCRmyPDF succeeded with warning(s): sh: ocrmypdf: not found #259

Open ferdiga opened 1 month ago

ferdiga commented 1 month ago

Describe the bug

Instead of "succeeded", the message should read "FAILED": [workflow_ocr] Warning: OCRmyPDF FAILED with warning(s): sh: ocrmypdf: not found

By the way, this happens because these packages are missing in the nextcloud-aio-nextcloud container, and I didn't find a way to add them permanently. Hence they have to be installed again after every container update:

    docker exec -it nextcloud-aio-nextcloud bash -c "apk add libreoffice" # not sure that's really necessary
    docker exec -it nextcloud-aio-nextcloud bash -c "apk add ocrmypdf"
    docker exec -it nextcloud-aio-nextcloud bash -c "apk add tesseract-ocr"
    docker exec -it nextcloud-aio-nextcloud bash -c "apk add tesseract-ocr-data-fra"
    docker exec -it nextcloud-aio-nextcloud bash -c "apk add tesseract-ocr-data-deu"
    docker exec -it nextcloud-aio-nextcloud bash -c "apk add tesseract-ocr-data-eng"

R0Wi commented 1 month ago

The only reliable way of mitigating this is probably to check whether the exit code is 127 here:

https://github.com/R0Wi-DEV/workflow_ocr/blob/d129a88b67b05dbf8e0bf05ee2d8d57fd8f32356/lib/OcrProcessors/OcrMyPdfBasedProcessor.php#L73

Interesting, though, that command->execute() apparently returns true.
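
For illustration, the exit-code idea can be sketched in plain shell. This is only a minimal sketch and not the app's PHP code: the helper name classify_exit and the dummy command name are made up for the example. It relies on the POSIX convention that a shell returns exit status 127 when the command to run cannot be found, which is what "sh: ocrmypdf: not found" indicates.

```shell
#!/bin/sh
# Hypothetical helper (not part of workflow_ocr): run a command line through
# `sh -c` and classify the result by its exit status.
classify_exit() {
    # Discard output; only the exit status matters for this sketch.
    sh -c "$1" >/dev/null 2>&1
    status=$?
    case "$status" in
        0)   echo "succeeded" ;;
        # POSIX shells report 127 when the command could not be found.
        127) echo "failed: command not found" ;;
        *)   echo "failed: exit code $status" ;;
    esac
}

classify_exit "true"
# Made-up command name, assumed not to exist on the system.
classify_exit "this-command-does-not-exist-xyz"
classify_exit "exit 3"
```

A check like this would let the log message distinguish a missing ocrmypdf binary from a run that finished with warnings.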