keensoft / alfresco-simple-ocr

Simple OCR action for Alfresco

Error ocrmypdf in Alfresco Linux version 6.1 #68

Open jrbrasil opened 4 years ago

jrbrasil commented 4 years ago

Hey guys, OCR is not being generated within the Alfresco platform.

See the logs below:

tail -f /opt/alfresco/tomcat/logs/catalina.out

command: /opt/alfresco/scripts/ocrmypdf.sh --verbose 1 --force-ocr -l por+eng /opt/alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_5414022605894367601.pdf /opt/alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_5414022605894367601_ocr.pdf
succeeded: false
exit code: 1
out:
err: Traceback (most recent call last):
  File "/usr/bin/ocrmypdf", line 11, in <module>
    load_entry_point('ocrmypdf==6.1.2', 'console_scripts', 'ocrmypdf')()
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 480, in load_entry_po
    at es.keensoft.alfresco.ocr.OCRExtractAction.executeImplInternal(OCRExtractAction.java:183)
    at es.keensoft.alfresco.ocr.OCRExtractAction.access$200(OCRExtractAction.java:38)
    at es.keensoft.alfresco.ocr.OCRExtractAction$1.execute(OCRExtractAction.java:164)
    at es.keensoft.alfresco.ocr.OCRExtractAction$1.execute(OCRExtractAction.java:161)
    at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:450)
    at es.keensoft.alfresco.ocr.OCRExtractAction.executeInNewTransaction(OCRExtractAction.java:169)
    at es.keensoft.alfresco.ocr.OCRExtractAction.access$100(OCRExtractAction.java:38)
    at es.keensoft.alfresco.ocr.OCRExtractAction$ExtractOCRTask.run(OCRExtractAction.java:151)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.alfresco.service.cmr.repository.ContentIOException: 08140019 Failed to perform OCR transformation:
Execution result:
   os: Linux
   command: /opt/alfresco/scripts/ocrmypdf.sh --verbose 1 --force-ocr -l por+eng /opt/alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_5414022605894367601.pdf /opt/alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_5414022605894367601_ocr.pdf
   succeeded: false
   exit code: 1
   out:
   err: Traceback (most recent call last):
  File "/usr/bin/ocrmypdf", line 11, in <module>
    load_entry_point('ocrmypdf==6.1.2', 'console_scripts', 'ocrmypdf')()
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 480, in load_entry_po
    at es.keensoft.alfresco.ocr.OCRTransformWorker.transform(OCRTransformWorker.java:86)
    at es.keensoft.alfresco.ocr.OCRExtractAction.executeImplInternal(OCRExtractAction.java:181)
    ... 10 more
Caused by: org.alfresco.service.cmr.repository.ContentIOException: 08140019 Failed to perform OCR transformation:
Execution result:
   os: Linux
   command: /opt/alfresco/scripts/ocrmypdf.sh --verbose 1 --force-ocr -l por+eng /opt/alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_5414022605894367601.pdf /opt/alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_5414022605894367601_ocr.pdf
   succeeded: false
   exit code: 1
   out:
   err: Traceback (most recent call last):
  File "/usr/bin/ocrmypdf", line 11, in <module>
    load_entry_point('ocrmypdf==6.1.2', 'console_scripts', 'ocrmypdf')()
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 480, in load_entry_po
    at es.keensoft.alfresco.ocr.OCRTransformWorker.transform(OCRTransformWorker.java:79)
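The Python traceback in the log is cut off at `load_entry_po`, so the actual error raised by ocrmypdf is not visible there. A minimal sketch for capturing the complete error output by running the same wrapper script outside Alfresco follows. `run_and_capture` is a hypothetical helper, and `/tmp/sample.pdf` and `/tmp/ocr_err.txt` are assumed paths, not anything from the add-on.

```shell
#!/bin/sh
# run_and_capture ERR_FILE CMD [ARGS...]
# Hypothetical helper: runs CMD, writes its stderr to ERR_FILE,
# prints the exit code, and returns that same exit code.
run_and_capture() {
    err_file="$1"
    shift
    status=0
    "$@" 2>"$err_file" || status=$?
    echo "exit code: $status"
    return "$status"
}

# Example usage (commented out; the flags are the ones from the log,
# the PDF paths are assumed test files). Run it as the user that
# Tomcat runs under so PATH and permissions match:
#
#   run_and_capture /tmp/ocr_err.txt /opt/alfresco/scripts/ocrmypdf.sh \
#       --verbose 1 --force-ocr -l por+eng /tmp/sample.pdf /tmp/sample_ocr.pdf
#   cat /tmp/ocr_err.txt
```

The saved file should then hold the full traceback, including the lines after `load_entry_po` that the Alfresco log dropped.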

root@pmituiutaba:/opt/alfresco/logs# gs --version
9.26

root@pmituiutaba:/opt/alfresco/logs# pip3 --version
pip 20.2.3 from /usr/local/lib/python3.6/dist-packages/pip (python 3.6)

root@pmituiutaba:/opt/alfresco/logs# tesseract --version
tesseract 4.0.0-beta.1
 leptonica-1.75.3
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
 Found AVX
 Found SSE

root@pmituiutaba:/opt/alfresco/logs# ocrmypdf --version
6.1.2

root@pmituiutaba:/opt/alfresco/logs# cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

cat alfresco.log | grep -i "Current version"
2020-09-15 00:04:09,348 INFO [org.alfresco.service.descriptor.DescriptorService] [localhost-startStop-1] Alfresco Content Services started (Community). Current version: 6.1.1 (r9d03d2fd-b168) schema 12,001. Originally installed version: 6.1.1 (r9d03d2fd-b168) schema 12,001.

cat /etc/sudoers
#
# This file MUST be edited with the 'visudo' command as root.
#
# Please consider adding local content in /etc/sudoers.d/ instead of
# directly modifying this file.
#
# See the man page for details on how to write a sudoers file.
#
Defaults env_reset
Defaults mail_badpass
Defaults secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin"

# Host alias specification

# User alias specification

# Cmnd alias specification

# User privilege specification
root ALL=(ALL:ALL) ALL
alfresco ALL=(ALL) NOPASSWD: ALL

# Members of the admin group may gain root privileges
%admin ALL=(ALL) ALL

# Allow members of group sudo to execute any command
%sudo ALL=(ALL:ALL) ALL

# See sudoers(5) for more information on "#include" directives:

#includedir /etc/sudoers.d

cat /opt/alfresco/tomcat/shared/classes/alfresco-global.properties | grep -i "ocr"
# OCR mit OCRmyPDF
ocr.command=/opt/alfresco/scripts/ocrmypdf.sh
ocr.output.verbose=false
ocr.output.file.prefix.command=
ocr.extra.commands=--verbose 1 --force-ocr -l por+eng
ocr.server.os=linux
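Judging from the logged command, the add-on appears to run the value of `ocr.command` followed by `ocr.extra.commands` and then the source and target PDF paths. That ordering is inferred from the log, not confirmed against the add-on source. A small sketch for printing the command line these properties would produce, so it can be compared with the one in catalina.out, is given here. `get_prop` is a hypothetical helper.

```shell
#!/bin/sh
# get_prop KEY FILE
# Hypothetical helper: prints the value of the last "KEY=value" line in FILE.
# Prints an empty line if the key is missing or has no value.
get_prop() {
    awk -v key="$1" '
        index($0, key "=") == 1 { val = substr($0, length(key) + 2) }
        END { print val }
    ' "$2"
}

# Path taken from the grep command above; skipped if the file is not readable.
props=/opt/alfresco/tomcat/shared/classes/alfresco-global.properties
if [ -r "$props" ]; then
    # <source.pdf> and <target.pdf> stand in for the temp files Alfresco creates.
    echo "$(get_prop ocr.command "$props") $(get_prop ocr.extra.commands "$props") <source.pdf> <target.pdf>"
fi
```

If the printed line differs from the logged command, the properties file being read is probably not the one Tomcat loaded.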

/opt/alfresco/modules/share# l
total 12K
-rw-r--r-- 1 root root 12K Sep 14 18:48 simple-ocr-share-2.3.1.jar

/opt/alfresco/modules/platform# l
total 28K
-rw-r--r-- 1 root root 28K Sep 14 18:48 simple-ocr-repo-2.3.1.jar

print-tela-modules

Can you help, please?