Closed alicedoe closed 6 years ago
Try these workarounds: https://github.com/keensoft/alfresco-simple-ocr/wiki/FAQ
Thanks for the answer, it really helped me find the problem. Using /bin/su, I found an issue regarding my locale variables:
`Traceback (most recent call last):
File "/bin/ocrmypdf", line 7, in
This system lists a couple of UTF-8 supporting locales that you can pick from. The following suitable locales were discovered: en_AG.utf8, en_AU.utf8, en_BW.utf8, en_CA.utf8, en_DK.utf8, en_GB.utf8, en_HK.utf8, en_IE.utf8, en_IN.utf8, en_NG.utf8, en_NZ.utf8, en_PH.utf8, en_SG.utf8, en_US.utf8, en_ZA.utf8, en_ZM.utf8, en_ZW.utf8`
It's weird, because I set it in my Dockerfile.
Is there any solution for this? I am having a similar problem. I had simple-ocr with ocrmypdf set up on another system with no problems. I am trying to set up a new system and getting the error below. The script runs fine when run outside Alfresco. I also tried the alternative scripts mentioned here.
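For anyone debugging the same thing: as far as I know, this error comes from a check in the Click library, which refuses to run when Python 3's preferred encoding resolves to ASCII. The sketch below only approximates that check so you can see what the process really gets. `is_ascii_encoding` is a hypothetical helper name, not part of ocrmypdf or Click.

```python
import codecs
import locale
import os


def is_ascii_encoding(name):
    """Return True if the encoding name resolves to plain ASCII."""
    try:
        return codecs.lookup(name).name == "ascii"
    except LookupError:
        # Unknown encoding name: treat it as unsuitable.
        return True


if __name__ == "__main__":
    # Show the variables that usually decide the process locale.
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        print("{}={!r}".format(var, os.environ.get(var)))
    encoding = locale.getpreferredencoding()
    print("preferred encoding:", encoding)
    print("resolves to ASCII:", is_ascii_encoding(encoding))
```

Run it as the same user, and in the same way, that Alfresco launches the OCR command. If the variables print as None there but not in your interactive shell, the Dockerfile setting is probably not reaching that process.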
Thanks GH
Execution result:
os: Linux
command: /opt/alfresco/scripts/ocrmypdf.sh /opt/alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_4220864260626106298.pdf /opt/alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_4220864260626106298_ocr.pdf
succeeded: false
exit code: 1
out:
err: Traceback (most recent call last):
File "/usr/local/bin/ocrmypdf", line 7, in
So if you run the script
$ /opt/alfresco/scripts/ocrmypdf.sh /opt/alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_4220864260626106298.pdf /opt/alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_4220864260626106298_ocr.pdf
from the command line, it works fine, right?
Yes. Also, thanks for a great product.
Hello. I came back to see if I could get up and running again, and I was able to sort out the above-mentioned problem by editing __main__.py and removing verify_python3_env(
The error I had was
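One way to check whether the difference between the shell and Alfresco is the environment is to run the same script with the locale variables stripped. This is only a sketch under that assumption. `strip_locale` and `run_without_locale` are hypothetical helper names, and the command in the usage note is a placeholder.

```python
import os
import subprocess


def strip_locale(env):
    """Return a copy of env without LANG, LANGUAGE and LC_* variables."""
    return {
        key: value
        for key, value in env.items()
        if key not in ("LANG", "LANGUAGE") and not key.startswith("LC_")
    }


def run_without_locale(command):
    """Run command (a list of strings) with no locale variables set."""
    return subprocess.run(
        command,
        env=strip_locale(os.environ),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
```

For example, `run_without_locale(["/opt/alfresco/scripts/ocrmypdf.sh", "in.pdf", "out.pdf"])` with your own input and output paths. If `returncode` is 1 and `stderr` shows the same traceback, the missing locale is the likely cause.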
File "/usr/local/lib/python3.5/dist-packages/ocrmypdf/__main__.py", line 69, in <module>
verify_python3_env(
at es.keensoft.alfresco.ocr.OCRTransformWorker.transform(OCRTransformWorker.java:79)
... 78 more
Thanks.
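An alternative that avoids editing the installed package (an edit that an upgrade of ocrmypdf would overwrite) is a small wrapper that sets a UTF-8 locale before calling ocrmypdf. This is a sketch, not something taken from the add-on. The locale name is an assumption picked from the list in the error message earlier in this thread, the ocrmypdf path is the one shown in the traceback above, and `utf8_env` and `run_ocr` are hypothetical names.

```python
import os
import subprocess
import sys

# Assumption: en_US.utf8 is installed, as listed in the error message above.
UTF8_LOCALE = "en_US.utf8"


def utf8_env(base=None, locale_name=UTF8_LOCALE):
    """Copy the environment and force a UTF-8 locale in the copy."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = locale_name
    env["LANG"] = locale_name
    return env


def run_ocr(args, ocrmypdf="/usr/local/bin/ocrmypdf"):
    """Run ocrmypdf with the given arguments and return its exit code."""
    return subprocess.call([ocrmypdf] + list(args), env=utf8_env())


if __name__ == "__main__":
    sys.exit(run_ocr(sys.argv[1:]))
```

Saved as an executable script, it could be used as the OCR command in place of calling ocrmypdf directly. The wrapper passes all arguments through unchanged.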
Try these workarounds: https://github.com/keensoft/alfresco-simple-ocr/wiki/FAQ
Hello @angelborroy-ks
I have this issue: I'm using ocrmypdf with Alfresco. ocrmypdf works well when I run the command manually, but the Alfresco 'OCR action' doesn't work. This is the log:
Caused by: java.lang.RuntimeException: org.alfresco.service.cmr.repository.ContentIOException: 000817996 Failed to perform OCR transformation:
Execution result:
os: Linux
command: /usr/local/bin/ocrmypdf --verbose 1 --force-ocr -l eng /opt/alfresco-community/tomcat/temp/Alfresco/OCRTransformWorker_source_4887267237326407155.pdf /opt/alfresco-community/tomcat/temp/Alfresco/OCRTransformWorker_source_4887267237326407155_ocr.pdf
succeeded: false
exit code: 1
out:
err: Traceback (most recent call last):
File "/usr/local/bin/ocrmypdf", line 5, in
Thanks for your help.
Hi,
I have an issue using alfresco-simple-ocr and get this error when I try to OCR a PDF:
Exception in thread "defaultAsyncAction1" java.lang.RuntimeException: java.lang.RuntimeException: org.alfresco.service.cmr.repository.ContentIOException: 10220020 Failed to perform OCR transformation:
Execution result:
os: Linux
command: /usr/bin/ocrmypdf --verbose 1 --force-ocr -l eng /alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_6969309335739725478.pdf /alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_6969309335739725478_ocr.pdf
succeeded: false
exit code: 1
out:
err: Traceback (most recent call last):
File "/usr/bin/ocrmypdf", line 7, in <module>
from ocrmypdf.__main__ import run_pipeline
File "/usr/lib/python3.5/site-packages/ocrmypdf/__main__.py", line 53, in <module>
_unicodefun._verify_python3_env
at es.keensoft.alfresco.ocr.OCRExtractAction.executeImplInternal(OCRExtractAction.java:183)
at es.keensoft.alfresco.ocr.OCRExtractAction.access$200(OCRExtractAction.java:38)
at es.keensoft.alfresco.ocr.OCRExtractAction$1.execute(OCRExtractAction.java:164)
at es.keensoft.alfresco.ocr.OCRExtractAction$1.execute(OCRExtractAction.java:161)
at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:464)
at es.keensoft.alfresco.ocr.OCRExtractAction.executeInNewTransaction(OCRExtractAction.java:169)
at es.keensoft.alfresco.ocr.OCRExtractAction.access$100(OCRExtractAction.java:38)
at es.keensoft.alfresco.ocr.OCRExtractAction$ExtractOCRTask.run(OCRExtractAction.java:151)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.alfresco.service.cmr.repository.ContentIOException: 10220020 Failed to perform OCR transformation:
Execution result:
os: Linux
command: /usr/bin/ocrmypdf --verbose 1 --force-ocr -l eng /alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_6969309335739725478.pdf /alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_6969309335739725478_ocr.pdf
succeeded: false
exit code: 1
out:
err: Traceback (most recent call last):
File "/usr/bin/ocrmypdf", line 7, in <module>
from ocrmypdf.__main__ import run_pipeline
File "/usr/lib/python3.5/site-packages/ocrmypdf/__main__.py", line 53, in <module>
_unicodefun._verify_python3_env
at es.keensoft.alfresco.ocr.OCRTransformWorker.transform(OCRTransformWorker.java:86)
at es.keensoft.alfresco.ocr.OCRExtractAction.executeImplInternal(OCRExtractAction.java:181)
... 10 more
Caused by: org.alfresco.service.cmr.repository.ContentIOException: 10220020 Failed to perform OCR transformation:
Execution result:
os: Linux
command: /usr/bin/ocrmypdf --verbose 1 --force-ocr -l eng /alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_6969309335739725478.pdf /alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_6969309335739725478_ocr.pdf
succeeded: false
exit code: 1
out:
err: Traceback (most recent call last):
File "/usr/bin/ocrmypdf", line 7, in <module>
from ocrmypdf.__main__ import run_pipeline
File "/usr/lib/python3.5/site-packages/ocrmypdf/__main__.py", line 53, in <module>
_unicodefun._verify_python3_env
at es.keensoft.alfresco.ocr.OCRTransformWorker.transform(OCRTransformWorker.java:79)
... 11 more
My config file:
### OCR config ###
ocr.command=/usr/bin/ocrmypdf
ocr.output.verbose=true
ocr.output.file.prefix.command=
ocr.extra.commands=--verbose 1 --force-ocr -l eng
ocr.server.os=linux
If I run this command directly, it works and the document is created: ocrmypdf --verbose 1 --force-ocr -l eng /alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_5341287848260715795.pdf /alfresco/tomcat/temp/Alfresco/OCRTransformWorker_source_5341287848260715795_ocr.pdf
I'm using CentOS and Alfresco with Docker.
Thanks :)