Closed by morpheus-sapiens-amans 10 months ago
I've just found this other post that describes the same problem I have.
Hi, I wanted to check with you what the parameters for the OCR plugin are on Linux. I went to the GitHub page and found that the link to the configuration for Mac and Linux does not work. Here are my config parameters:
First, I installed tesseract-ocr provided by my repositories.
For the OCR engine: /usr/bin/tesseract
For pdftoppm: /usr/bon/pdftoppm
For the language script: script/Latin
Isn't /usr/share/tesseract-ocr just the directory? Then you should try something like /usr/share/tesseract-ocr/tesseract instead. You can also simply leave the setting empty; the default options will then be tried:
https://github.com/UB-Mannheim/zotero-ocr/blob/9a1e87c8e5a588c7f9c046f812ee7de55f277ec1/chrome/content/zoteroocr.js#L36
If this does not help, activate the debug log and check exactly which path is used to call tesseract.
/usr/bin/tesseract is the correct setting for the typical installation on Linux.
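The difference between a directory and the binary inside it can be checked from a shell before entering a path in the plugin preferences. This is only a minimal sketch, assuming a POSIX shell; classify_path is a hypothetical helper name, not something provided by Zotero OCR or tesseract:

```shell
# classify_path is a hypothetical helper (not part of Zotero OCR).
# It reports whether a path could serve as an "executable" setting.
classify_path() {
    if [ -d "$1" ]; then
        echo "directory"          # e.g. /usr/share/tesseract-ocr
    elif [ -f "$1" ] && [ -x "$1" ]; then
        echo "executable"         # e.g. /usr/bin/tesseract on a typical install
    elif [ -e "$1" ]; then
        echo "not executable"     # exists, but cannot be run
    else
        echo "missing"            # nothing at this path
    fi
}

# Paths taken from this thread; the results depend on the local installation.
classify_path /usr/share/tesseract-ocr
classify_path /usr/bin/tesseract
```

Only a path reported as "executable" is a plausible value for the OCR engine setting; a "directory" result matches the /usr/share/tesseract-ocr case discussed above.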
Hi, I'm having a similar problem. In my case, I have set the right path, but it still says no executable was found.
~ ❯ which tesseract
/usr/bin/tesseract
~ ❯ which pdftoppm
/usr/bin/pdftoppm
I don't know why.
For me it was because I was using the flatpak version which messes everything up. Reinstalling from the tarball as described here made everything work.
Same problem here on Nobara Linux 38 Wayland (GNOME 44.2).
Try activating the debug output in Zotero, then select a test PDF and click "OCR selected PDF(s)". The debug output should then show exactly which path is used to call each tool.
When I tested it today, I found that the issue seemed to be resolved. It's been a long time since then, so I don't know what caused the problem at that time.
tesseract-ocr is the engine used by Zotero OCR to recognize and extract content, but the installation guide only shows the path for Windows machines. I ran whereis tesseract-ocr to locate the path for the engine and got /usr/share/tesseract-ocr as a result, but when I entered that in the Zotero preferences, it said no executable was found. Does anyone know how to configure this?
thanks
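Looking up the command names themselves, rather than the package name, returns the kind of path the preferences expect. The following is a sketch under the assumption of a POSIX shell; find_tool is a hypothetical helper name, and the two tool names are the ones mentioned in this thread:

```shell
# find_tool is a hypothetical helper: print the full path of a command
# found on PATH, or a note on stderr if it cannot be found.
find_tool() {
    tool_path=$(command -v "$1") || {
        echo "$1: not found on PATH" >&2
        return 1
    }
    echo "$tool_path"
}

# Look up the two executables the plugin needs; a missing tool is only reported.
for tool in tesseract pdftoppm; do
    find_tool "$tool" || true
done
```

On a typical Linux installation this should print paths such as /usr/bin/tesseract, which can then be copied into the Zotero OCR preferences.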