keensoft / alfresco-simple-ocr

Simple OCR action for Alfresco

unpaper -V failed #61

Open Xavier74 opened 5 years ago

Xavier74 commented 5 years ago

Hello,

I'm trying to configure OCR on Alfresco Community Edition 5.2 with pdfsandwich, on Debian 10 inside a VM. Everything is installed and configured, and PDF OCR works when launched manually from the command line. But when the process is started from Alfresco, I get this error:

at es.keensoft.alfresco.ocr.OCRExtractAction.executeImplInternal(OCRExtractAction.java:183)
at es.keensoft.alfresco.ocr.OCRExtractAction.executeImpl(OCRExtractAction.java:119)
at org.alfresco.repo.action.executer.ActionExecuterAbstractBase.execute(ActionExecuterAbstractBase.java:273)
at org.alfresco.repo.action.ActionServiceImpl.directActionExecution(ActionServiceImpl.java:856)
at org.alfresco.repo.action.executer.CompositeActionExecuter.executeImpl(CompositeActionExecuter.java:73)
at org.alfresco.repo.action.executer.ActionExecuterAbstractBase.execute(ActionExecuterAbstractBase.java:273)
at org.alfresco.repo.action.ActionServiceImpl.directActionExecution(ActionServiceImpl.java:856)
at org.alfresco.repo.action.ActionServiceImpl.executeActionImpl(ActionServiceImpl.java:757)
at org.alfresco.repo.action.AsynchronousActionExecutionQueueImpl$ActionExecutionWrapper$1$1.execute(AsynchronousActionExecutionQueueImpl.java:430)
at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:464)
at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:333)
at org.alfresco.repo.action.AsynchronousActionExecutionQueueImpl$ActionExecutionWrapper$1.doWork(AsynchronousActionExecutionQueueImpl.java:439)
at org.alfresco.repo.tenant.TenantUtil.runAsWork(TenantUtil.java:126)
at org.alfresco.repo.tenant.TenantUtil.runAsTenant(TenantUtil.java:95)
at org.alfresco.repo.tenant.TenantUtil$1.doWork(TenantUtil.java:69)
at org.alfresco.repo.security.authentication.AuthenticationUtil.runAs(AuthenticationUtil.java:555)
at org.alfresco.repo.tenant.TenantUtil.runAsUserTenant(TenantUtil.java:65)
at org.alfresco.repo.action.AsynchronousActionExecutionQueueImpl$ActionExecutionWrapper.run(AsynchronousActionExecutionQueueImpl.java:442)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.alfresco.service.cmr.repository.ContentIOException: 10060569 Failed to perform OCR transformation:
Execution result:
os: Linux
command: /usr/bin/pdfsandwich -verbose -lang spa+eng+fra /opt/alfresco-community-5.2/tomcat/temp/Alfresco/OCRTransformWorker_source_6628984190029574112.pdf -o /opt/alfresco-community-5.2/tomcat/temp/Alfresco/OCRTransformWorker_source_6628984190029574112_ocr.pdf
succeeded: false
exit code: 2
out: pdfsandwich version 0.1.7
Checking for convert: convert -version
Version: ImageMagick 7.0.5-2 Q16 x86_64 2017-04-04 http://www.imagemagick.org
Copyright: © 1999-2017 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Featur
err: unpaper: symbol lookup error: /usr/lib/x86_64-linux-gnu/libgdk_pixbuf-2.0.so.0: undefined symbol: g_bytes_unref
ERROR: Command "unpaper -V" failed.
tesseract: /opt/alfresco-community-5.2/common/lib/libtiff.so.5: no version information available (req
at es.keensoft.alfresco.ocr.OCRTransformWorker.transform(OCRTransformWorker.java:86)
at es.keensoft.alfresco.ocr.OCRExtractAction.executeImplInternal(OCRExtractAction.java:181)
... 20 more
Caused by: org.alfresco.service.cmr.repository.ContentIOException: 10060569 Failed to perform OCR transformation:
Execution result:
os: Linux
command: /usr/bin/pdfsandwich -verbose -lang spa+eng+fra /opt/alfresco-community-5.2/tomcat/temp/Alfresco/OCRTransformWorker_source_6628984190029574112.pdf -o /opt/alfresco-community-5.2/tomcat/temp/Alfresco/OCRTransformWorker_source_6628984190029574112_ocr.pdf
succeeded: false
exit code: 2
out: pdfsandwich version 0.1.7
Checking for convert: convert -version
Version: ImageMagick 7.0.5-2 Q16 x86_64 2017-04-04 http://www.imagemagick.org
Copyright: © 1999-2017 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Featur
err: unpaper: symbol lookup error: /usr/lib/x86_64-linux-gnu/libgdk_pixbuf-2.0.so.0: undefined symbol: g_bytes_unref
ERROR: Command "unpaper -V" failed.
tesseract: /opt/alfresco-community-5.2/common/lib/libtiff.so.5: no version information available (req
at es.keensoft.alfresco.ocr.OCRTransformWorker.transform(OCRTransformWorker.java:79)
... 21 more

Copying and pasting the given command into the CLI as the alfresco user works perfectly.
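Since the same command behaves differently in a shell and under Alfresco, one thing that could be compared is the environment of the two processes. The tesseract warning in the trace names a libtiff under /opt/alfresco-community-5.2/common/lib, which suggests (but does not prove) that the Alfresco process resolves shared libraries from its own bundled directory. The Python sketch below assumes this happens through `LD_LIBRARY_PATH`; the helper names `without_bundled_libs` and `run_unpaper_version` and the `tomcat_like` environment are hypothetical and only illustrate the comparison.

```python
import os
import subprocess

# Assumption: Alfresco's installer-bundled libraries live under this prefix
# (taken from the libtiff path in the tesseract warning of the trace).
ALFRESCO_HOME = "/opt/alfresco-community-5.2"


def without_bundled_libs(env, prefix=ALFRESCO_HOME):
    """Return a copy of env whose LD_LIBRARY_PATH has no entries under prefix."""
    cleaned = dict(env)
    entries = cleaned.get("LD_LIBRARY_PATH", "").split(":")
    kept = [
        e for e in entries
        if e and not (e == prefix or e.startswith(prefix + "/"))
    ]
    if kept:
        cleaned["LD_LIBRARY_PATH"] = ":".join(kept)
    else:
        # Nothing left: drop the variable so the system defaults apply.
        cleaned.pop("LD_LIBRARY_PATH", None)
    return cleaned


def run_unpaper_version(env):
    """Run `unpaper -V` under env and return (exit code, stderr)."""
    result = subprocess.run(
        ["unpaper", "-V"], env=env, capture_output=True, text=True
    )
    return result.returncode, result.stderr


if __name__ == "__main__":
    # Hypothetical environment resembling what the Tomcat process might export.
    tomcat_like = dict(os.environ, LD_LIBRARY_PATH=ALFRESCO_HOME + "/common/lib")
    print("with bundled libs:   ", run_unpaper_version(tomcat_like))
    print("without bundled libs:", run_unpaper_version(without_bundled_libs(tomcat_like)))
```

If `unpaper -V` fails only in the first run, the bundled libraries are a likely cause; if both runs succeed, the difference lies elsewhere and this assumption does not hold.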

Do you have any idea what is causing this?

Many thanks in advance