Closed lodzen closed 3 months ago
Thanks for reporting this. Unfortunately I cannot reproduce the issue. Here is what I did:
ocrmypdf inside of the container
sudo -u www-data php cron.php
Result: a new file version is created as expected, and the text is selectable inside the document.
Please use our troubleshooting guide and repeat your process. If you decreased your logging level as described, there must be some server logs. Those are mandatory for us to understand the problem.
Thanks for your help
Hello,
I set up the flow exactly as in your screenshot:
The cron is configured to run every 5 minutes:
Test PDF:
Even after 15 minutes the file was not analyzed:
The difference is that it's not a personal flow on my end, it's a global one.
OK, but even with a personal flow the job is not executed.
Your frontend configuration looks correct. Nevertheless, without additional backend logs it will be impossible to find the error. As described here, please decrease your NC loglevel, repeat the process (don't forget to execute the cron manually) and post your logs here.
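The two backend steps requested above (lower the log level, then run the cron job by hand) could be scripted roughly as follows. This is only a sketch under assumptions: the install path `/var/www/html` and the helper names `build_commands` and `run_all` are hypothetical, the `www-data` user is taken from the command quoted earlier in this thread, and the log level is set through `occ config:system:set` (level 0 is debug).

```python
import subprocess
from typing import List


def build_commands(nc_root: str, web_user: str = "www-data") -> List[List[str]]:
    """Build the commands to enable debug logging and run cron.php once.

    nc_root is the Nextcloud installation directory (an assumption; adjust it
    to your setup, e.g. the path inside your container).
    """
    root = nc_root.rstrip("/")
    prefix = ["sudo", "-u", web_user, "php"]
    return [
        # Set the Nextcloud log level to 0 (debug) in config.php.
        prefix + [f"{root}/occ", "config:system:set", "loglevel",
                  "--value=0", "--type=integer"],
        # Run the background jobs manually instead of waiting for cron.
        prefix + [f"{root}/cron.php"],
    ]


def run_all(commands: List[List[str]]) -> None:
    """Execute each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling `run_all(build_commands("/var/www/html"))` on the server (or inside the container) would run both steps. Upload a test PDF between the two steps if you want the OCR job to be queued before the cron run.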
I have created the logs now and tried to prefilter them as best as possible: flow.log, nextcloud.log
There are two interesting lines in your nextcloud.log: one is logged by the workflowengine itself and the other is logged by this app (workflow_ocr):
{"reqId":"jiLiFxRBg9xJAAFAsV97","level":0,"time":"2024-01-25T09:32:59+00:00","remoteAddr":"79.249.68.60","user":"daniel","app":"workflowengine","method":"GET","url":"/core/preview?fileId=11909&x=250&y=250","message":"No flow configurations is going to run OCR-Datei","userAgent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36","version":"28.0.1.1","data":{"app":"workflowengine","level":"0"},"id":196}
{"reqId":"jiLiFxRBg9xJAAFAsV97","level":0,"time":"2024-01-25T09:32:59+00:00","remoteAddr":"79.249.68.60","user":"daniel","app":"workflow_ocr","method":"GET","url":"/core/preview?fileId=11909&x=250&y=250","message":"Not processing event because IRuleMatcher->getFlows did not return anything","userAgent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36","version":"28.0.1.1","data":{"app":"workflow_ocr"},"id":197}
I think the interesting bit here is "No flow configurations is going to run OCR-Datei", which tells us that there might be some misconfiguration in your workflow. The second line tells us basically the same: "Not processing event because IRuleMatcher->getFlows did not return anything".
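To pull lines like these out of a large nextcloud.log, a small filter can help. The sketch below assumes the log is in the JSON-lines format shown above (one JSON object per line with `app`, `time` and `message` fields). The function names are hypothetical and not part of Nextcloud or workflow_ocr.

```python
import json
from typing import Iterable, Iterator

# Apps mentioned in this thread; extend the set if other apps are relevant.
WORKFLOW_APPS = {"workflowengine", "workflow_ocr"}


def workflow_entries(lines: Iterable[str], apps=WORKFLOW_APPS) -> Iterator[dict]:
    """Yield parsed log entries whose "app" field is one of `apps`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(entry, dict) and entry.get("app") in apps:
            yield entry


def summarize(entry: dict) -> str:
    """Format one entry as 'time [app] message'."""
    return f'{entry.get("time")} [{entry.get("app")}] {entry.get("message")}'


def print_workflow_entries(path: str) -> None:
    """Print a one-line summary for every matching entry in the log file."""
    with open(path, encoding="utf-8") as fh:
        for entry in workflow_entries(fh):
            print(summarize(entry))
```

For example, `print_workflow_entries("nextcloud.log")` would list only the workflowengine and workflow_ocr messages, which makes it easier to spot whether any flow matched at all.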
At the moment I have no idea why it behaves like this for you but it doesn't seem to be a general problem with Nextcloud 28 since I can't reproduce the problem. I think further investigation is needed here.
If you set up a workflow with the same conditions (file created/updated, mimetype is PDF) but use the "Workflow Tagging" instead, does that one work? That is, does it tag your PDF files correctly?
Some technical details:
Both log messages are produced by the workflowengine of Nextcloud, which contains the core logic for workflow apps. In this case the getFlows method is called by our workflow_ocr app and the core logic of Nextcloud tells the app to "not run".
Describe the bug
Trigger OCR if file was created or updated is not working
System
ocrmypdf version: 14.1.0-r1
How to reproduce
Steps to reproduce the behavior:
Configure the workflow as mentioned in the manual.
The OCR is not triggered by Nextcloud if a file was added or modified.
Screenshots
Conversion only works for tags. The ocr tag is added multiple times because the automated tagging workflow is used on top.
Server log
Please paste relevant content of your nextcloud.log file here. It might make sense to first decrease the loglevel. Also, since the OCR process runs asynchronously, run your cron.php before copying the logs here.