Closed Nocturna22 closed 1 year ago
Think you've set your conditions wrong. A file cannot be an image AND a pdf at the same time ;-)
Our README might help you get there. Let me know if that works.
Thanks for the quick answer :) At first I felt very stupid xD because I thought these were separate jobs I had configured. Then I was happy because I thought my problem would be solved as simply as that... but unfortunately that didn't work either. I have read the README more than once xD I have now tried the non-admin variant for test purposes. It did not work there either.
I have tested most of the criteria, e.g. request time between 12:00 and 11:59 should really work. Then I selected the OCR mode "Force OCR" and uploaded more than 100 documents to be sure. After executing cron.php (which took much longer than usual, so it must be doing something) I still didn't have any PDF documents in the same folder. I also checked this in the terminal. After that I searched the whole server for PDFs with several methods, but only found the ones that were already there. I don't know what else I can do. There is nothing in the log either.
Greetings from the Schwarzwald ;)
EDIT: Okay, the big upload and cronjob did, in fact, produce something... But unfortunately no PDFs. Just errors:
Error | workflow_ocr | OCR for file /8BitBrainz/files/12366/IMG_20210707_175631.jpg not possible. Message: OCRmyPDF did not produce any output
Ok here are a few things I'd suggest:
- Set your loglevel temporarily to DEBUG (0) to get some extra logs.
- Check your nextcloud.log (or use the logreader app). You should see at least one line starting with "Adding file to jobqueue: ".
- Depending on whether you can see such a line, you can go on by checking if there is an appropriate entry in the database table oc_jobs, and then try to run cron.php again. If you can't see such a log message, then the file isn't being added to the OCR processing queue for some reason.
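The log check suggested above can be scripted. This is only a minimal sketch: it assumes nextcloud.log contains one JSON object per line with a "message" field, and the helper name `find_queued_files` is made up for illustration. Where nextcloud.log lives depends on your installation, so the path in the usage comment is a placeholder.

```python
import json
from typing import Iterable, List

# Prefix quoted in the suggestion above.
PREFIX = "Adding file to jobqueue: "


def find_queued_files(log_lines: Iterable[str], prefix: str = PREFIX) -> List[str]:
    """Return the part after `prefix` for every log message starting with it.

    Hypothetical helper: assumes each log line is a JSON object with a
    "message" key. Lines that are not valid JSON objects are skipped.
    """
    queued = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a JSON log line
        if not isinstance(entry, dict):
            continue
        message = entry.get("message", "")
        if isinstance(message, str) and message.startswith(prefix):
            queued.append(message[len(prefix):])
    return queued


# Usage (the path is an assumption, adjust it to your data directory):
# with open("/path/to/nextcloud/data/nextcloud.log", encoding="utf-8") as fh:
#     for name in find_queued_files(fh):
#         print(name)
```

If the list comes back empty after a fresh upload at DEBUG level, that points to the workflow conditions rather than to OCRmyPDF.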
EDIT: Okay, the big upload and cronjob did, in fact, produce something... But unfortunately no PDFs. Just errors: Error | workflow_ocr | OCR for file /8BitBrainz/files/12366/IMG_20210707_175631.jpg not possible. Message: OCRmyPDF did not produce any output
Okay, but it seems that at least your setup is correct now. Please either decrease your loglevel or try to process the file via ocrmypdf directly and see what it tells you. Otherwise, to keep things a bit simpler, drop me a PM if you like.
Greets from the Bodensee ;-)
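For anyone following along, running ocrmypdf directly could look roughly like the sketch below. The two helper functions are hypothetical names for illustration. The flags used (`--image-dpi`, `--force-ocr`, `--remove-background`) are regular ocrmypdf command-line options, but which of them the Nextcloud app actually passes for a given workflow is not shown in this thread, so treat the combination as an assumption.

```python
import subprocess
from typing import List, Optional, Tuple


def build_ocrmypdf_command(
    input_path: str,
    output_path: str,
    image_dpi: Optional[int] = None,
    force_ocr: bool = False,
    remove_background: bool = False,
) -> List[str]:
    """Assemble an ocrmypdf command line (hypothetical helper)."""
    cmd = ["ocrmypdf"]
    if image_dpi is not None:
        cmd += ["--image-dpi", str(image_dpi)]
    if force_ocr:
        cmd.append("--force-ocr")
    if remove_background:
        cmd.append("--remove-background")
    cmd += [input_path, output_path]
    return cmd


def run_ocrmypdf(cmd: List[str]) -> Tuple[int, str]:
    """Run the command and return (exit code, stderr text).

    No shell is involved, so "~" is not expanded; pass full paths or
    use os.path.expanduser() first.
    """
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.returncode, result.stderr


# Usage (requires ocrmypdf to be installed; file names are placeholders):
# code, err = run_ocrmypdf(build_ocrmypdf_command("10.jpg", "myfile.pdf", image_dpi=300))
# print(code, err)
```

Running the same file once with and once without a given option is a quick way to see which option triggers an error.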
Long story short:
There is a problem with the "remove-background" function.
Thanks for the tips with debugging. That will help me a lot in the future :D
I don't know if it would be good to move this to PM. Maybe someone will have the same problem sometime.
I mean, you can only change the loglevel in the Nextcloud config.php, right? And that only updates when I restart Apache, right? Because I have an active command running right now that will take some time, and I don't know if that will get messed up when I restart Apache. But the loglevel thing gave me an idea: I did not have the warnings enabled. (I had turned them off only recently, because I had read that they are not important and I was being spammed about the unconfigured SMTP server.)
When I enabled the warnings, I got an error about the remove-background function (I activated it for testing after switching from the admin workflow to the user workflow and forgot about it -.-"):
The important information in this message is "remove-background is temporarily not implemented"
OCRmyPDF succeeded with warning(s): An exception occurred while executing the pipeline
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/concurrent/futures/process.py", line 246, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/usr/lib/python3/dist-packages/ocrmypdf/_sync.py", line 192, in exec_page_sync
    ocr_image, preprocess_out = make_intermediate_images(
  File "/usr/lib/python3/dist-packages/ocrmypdf/_sync.py", line 126, in make_intermediate_images
    ocr_image = preprocess_out = preprocess(
  File "/usr/lib/python3/dist-packages/ocrmypdf/_sync.py", line 104, in preprocess
    image = preprocess_remove_background(image, page_context)
  File "/usr/lib/python3/dist-packages/ocrmypdf/_pipeline.py", line 469, in preprocess_remove_background
    raise NotImplementedError("--remove-background is temporarily not implemented")
NotImplementedError: --remove-background is temporarily not implemented
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/ocrmypdf/_sync.py", line 385, in run_pipeline
    exec_concurrent(context, executor)
  File "/usr/lib/python3/dist-packages/ocrmypdf/_sync.py", line 274, in exec_concurrent
    executor(
  File "/usr/lib/python3/dist-packages/ocrmypdf/_concurrent.py", line 82, in __call__
    self._execute(
  File "/usr/lib/python3/dist-packages/ocrmypdf/builtin_plugins/concurrency.py", line 135, in _execute
    result = future.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
NotImplementedError: --remove-background is temporarily not implemented,
I forgot to say: the PDFs are now generated successfully after disabling the remove-background function :D So there is a problem, but I only found it through a human error ^^
Have a nice Day!
And sorry for stealing your time .-.
Edit: You definitely need to add a "buy me a coffee" button ;D
Thanks for the tips with debugging. That will help me a lot in the future :D
Hopefully you won't need it in the future :smile_cat:
I forgot to say: the PDFs are now generated successfully after disabling the remove-background function :D
Glad to hear that things are working now. I think you might be hitting https://github.com/ocrmypdf/OCRmyPDF/issues/884. I didn't have these problems in the past since I use the official Debian packages, which aren't updated very regularly. So falling back to an ocrmypdf version prior to v13 might also fix it.
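To check which side of that version boundary an installation is on, something like the following sketch could help. It assumes the version text contains an x.y.z number (for example the output of `ocrmypdf --version`), and the threshold of 13 is taken only from the guess above, not from the OCRmyPDF changelog. Both function names are made up for illustration.

```python
import re
from typing import Optional, Tuple


def parse_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract the first x.y.z version number found in `text`."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def may_hit_remove_background_issue(
    version_text: str, first_affected_major: int = 13
) -> Optional[bool]:
    """Return True if the major version is >= the assumed threshold.

    The default threshold (13) is an assumption based on this thread.
    Returns None when no version number can be parsed.
    """
    version = parse_version(version_text)
    if version is None:
        return None
    return version[0] >= first_affected_major


# Usage (requires ocrmypdf to be installed):
# import subprocess
# out = subprocess.run(["ocrmypdf", "--version"], capture_output=True, text=True).stdout
# print(may_hit_remove_background_issue(out))
```

A `True` result only means the installed version is at or above the assumed boundary; the linked upstream issue is the place to check the actual status.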
And sorry for stealing your time .-.
No worries, like I said, I'm glad to help :rocket: Have a nice day, too :+1:
Edit: You definitely need to add a "buy me a coffee" button ;D
Button added :smile: https://www.buymeacoffee.com/R0Wi
Glad to hear that things are working now. I think you might be hitting ocrmypdf/OCRmyPDF#884. I didn't have these problems in the past since I use the official Debian packages, which aren't updated very regularly. So falling back to an ocrmypdf version prior to v13 might also fix it.
I will have a look there :)
Button added 😄 https://www.buymeacoffee.com/R0Wi
Somehow my life is peppered with errors -.-" But I WILL buy you a coffee when this error is gone xD
Hello :)
Nextcloud version: 25.0.3
Workflow OCR version: 1.25.2
Somehow the workflow I created does not work. I created a new workflow in the admin section.
Settings:
After that, I uploaded 2 files: 1 JPG and 1 PNG. Neither of them was converted after I ran this command:
sudo -u www-data /usr/bin/php cron.php
How does cron.php handle the new conversions? Does it add them to the end of the list? Because I'm currently detecting faces using the "recognize" Nextcloud app (I think it uses tesseract) and it's taking a loooong time .-.
I tested the ocrmypdf backend with and without root using the following command:
ocrmypdf --image-dpi 300 ~/10.jpg ~/myfile.pdf
It converts my files every time. I don't know what to do now or how to troubleshoot...
Greetings