Open pickleton89 opened 1 year ago
Thank you for your issue. Can you confirm that you're using the newest version of obsidian-ocr, which is currently 2.0.0?
Yes. Everything is up to date. Also, my Obsidian is Version 1.1.12 (Installer 1.1.9).
Alright, thanks for checking.
When you say the files have hocr added to the name, do you mean you have a file x.png and also a file x.png.hocr?
I sorted the files alphabetically and see that there are the base .png files and then, for each, another file with the same name ending in .hocr (x.png.hocr). I hadn't noticed that before.
Jeff
Alright, it's good to hear that the original file is still there, was a bit scared that I screwed something up, and it deleted the files 😌
The .hocr files are remnants of an older version of obsidian-ocr that stored the OCR information for x.png in x.png.hocr.
You can either leave them there and ignore them, or simply delete them.
Since version 2.0.0, the information is stored in a SQLite database, called .obsidian-ocr.sqlite, in the root of your vault.
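For anyone who wants to check which leftover files they have before deleting anything, here is a minimal TypeScript sketch. It assumes you already have a list of all paths in the vault (how you obtain it is left open), and `findLeftoverHocr` is a hypothetical helper name, not part of obsidian-ocr. It only reports `.hocr` files whose original file is still present and deletes nothing.

```typescript
// Hypothetical helper (not part of obsidian-ocr): given every path in a vault,
// return the `.hocr` sidecar files whose original file (the same name minus
// the `.hocr` suffix) is still present in the listing.
function findLeftoverHocr(vaultPaths: string[]): string[] {
  const suffix = ".hocr";
  const known = new Set(vaultPaths);
  return vaultPaths.filter(
    (p) => p.endsWith(suffix) && known.has(p.slice(0, -suffix.length)),
  );
}

// Illustrative listing; the paths are made up for the example.
const example = [
  "attachments/x.png",
  "attachments/x.png.hocr",
  "attachments/y.png",
  ".obsidian-ocr.sqlite",
];
console.log(findLeftoverHocr(example)); // only "attachments/x.png.hocr" qualifies
```

Requiring the original file to exist is a deliberately cautious choice: a `.hocr` file without a matching original is left out of the result so it can be looked at by hand.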
Concerning the problem you described above: Could you please open the developer console and see if any errors are reported?
I opened the OCR search command and typed in a query and got this error in console after pressing return on the search.

Unfortunately, I can't seem to see the attached image.
Sorry about that. I was replying by email and the image didn't come through. I posted it above.
Alright, thanks for the image.
Could you please enable Log to file in the settings of obsidian-ocr, restart Obsidian and perform the same steps you did to produce the error above?
After that, could you please attach the log file?
Sorry, but I am not sure where the log file gets created or how to find it.
I found the file. obsidian-ocr.log
Thank you for the log. I think I have somewhat of an idea what's going on here. Could you please tell me which OS you're using?
I am currently running macOS Ventura 13.2
Okay, and could you tell me the output of tesseract -v in your terminal?
tesseract 5.2.0
leptonica-1.82.0
libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 2.1.3) : libpng 1.6.39 : libtiff 4.4.0 : zlib 1.2.11 : libwebp 1.2.4 : libopenjp2 2.5.0
Found NEON
Found libarchive 3.6.2 zlib/1.2.11 liblzma/5.2.9 bz2lib/1.0.8 liblz4/1.9.4 libzstd/1.5.2
Found libcurl/7.86.0 SecureTransport (LibreSSL/3.3.6) zlib/1.2.11 nghttp2/1.47.0
Alright, after looking at the logs I know what doesn't work, but I don't know why it doesn't work. This is the code responsible for running tesseract:
...
const execReturn = exec(`tesseract ${this.settings.additionalArguments} "${source}" stdout -l ${this.settings.lang} hocr`);
...
This would (as an example) translate into a command like this:
tesseract "/some/file.png" stdout -l eng hocr
The problem (as can be seen in the log) is that tesseract for some reason misinterprets the command, giving the following error message:
read_params_file: Can't open stdout
read_params_file: Can't open -l
read_params_file: Can't open eng
Error, cannot read input file undefined: No such file or directory
Error during processing.
As can be seen in the error message, tesseract tries to open stdout, -l and eng as files, even though they are just command line arguments, which is quite strange. I can only assume that there is some sort of problem with the way the file path is handed to tesseract, because when I input the bogus command tesseract some/file path/image.png stdout -l eng, I get a similar error:
read_params_file: Can't open stdout
read_params_file: Can't open -l
read_params_file: Can't open eng
On the other hand, the file path is wrapped in "", which causes tesseract to behave properly again.
Therefore, some more investigation is necessary so sit tight 😊
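One possible reading of the log, offered as a guess rather than a confirmed diagnosis: the last error names the input file as `undefined`, which is exactly the text a JavaScript template literal produces when an interpolated value is unset. If `additionalArguments` were unset, tesseract would presumably take `undefined` as the image and the quoted path as the output base, leaving `stdout`, `-l` and `eng` to be read as config files, just as in the bogus command with the unquoted space. The TypeScript sketch below illustrates that effect and one way to avoid it. `OcrSettings`, `naiveCommand` and `buildTesseractArgs` are hypothetical names for illustration, not the plugin's actual code.

```typescript
// Hypothetical settings shape; the field names mirror the snippet above.
interface OcrSettings {
  additionalArguments?: string;
  lang: string;
}

// String form, as in the snippet above: an unset option is interpolated
// as the literal text "undefined".
function naiveCommand(settings: OcrSettings, source: string): string {
  return `tesseract ${settings.additionalArguments} "${source}" stdout -l ${settings.lang} hocr`;
}

// Argument-vector form: an unset option contributes nothing, and a path
// containing spaces stays a single argument without any quoting.
function buildTesseractArgs(settings: OcrSettings, source: string): string[] {
  const extra = (settings.additionalArguments ?? "")
    .split(/\s+/)
    .filter((arg) => arg.length > 0);
  return [...extra, source, "stdout", "-l", settings.lang, "hocr"];
}

const settings: OcrSettings = { lang: "eng" };
console.log(naiveCommand(settings, "/some/file.png"));
// tesseract undefined "/some/file.png" stdout -l eng hocr
console.log(buildTesseractArgs(settings, "/some/file path/image.png"));
```

An array like this could be handed to Node's `child_process.execFile`, which by default does not go through a shell, so the path needs no quoting at all.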
I ran the path checks as listed below:
(base) [~]$ brew list tesseract
/opt/homebrew/Cellar/tesseract/5.2.0/bin/tesseract
/opt/homebrew/Cellar/tesseract/5.2.0/include/tesseract/ (12 files)
/opt/homebrew/Cellar/tesseract/5.2.0/lib/libtesseract.5.dylib
/opt/homebrew/Cellar/tesseract/5.2.0/lib/pkgconfig/tesseract.pc
/opt/homebrew/Cellar/tesseract/5.2.0/lib/ (2 other files)
/opt/homebrew/Cellar/tesseract/5.2.0/share/tessdata/ (35 files)
(base) [~]$ brew list tesseract-lang
/opt/homebrew/Cellar/tesseract-lang/4.1.0/share/tessdata/ (162 files)
(base) [~]$ brew list imagemagick
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/Magick++-config
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/MagickCore-config
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/MagickWand-config
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/animate
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/compare
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/composite
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/conjure
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/convert
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/display
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/identify
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/import
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/magick
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/magick-script
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/mogrify
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/montage
/opt/homebrew/Cellar/imagemagick/7.1.0-54/bin/stream
/opt/homebrew/Cellar/imagemagick/7.1.0-54/etc/ImageMagick-7/ (13 files)
/opt/homebrew/Cellar/imagemagick/7.1.0-54/include/ImageMagick-7/ (137 files)
/opt/homebrew/Cellar/imagemagick/7.1.0-54/lib/libMagick++-7.Q16HDRI.5.dylib
/opt/homebrew/Cellar/imagemagick/7.1.0-54/lib/libMagickCore-7.Q16HDRI.10.dylib
/opt/homebrew/Cellar/imagemagick/7.1.0-54/lib/libMagickWand-7.Q16HDRI.10.dylib
/opt/homebrew/Cellar/imagemagick/7.1.0-54/lib/ImageMagick/ (261 files)
/opt/homebrew/Cellar/imagemagick/7.1.0-54/lib/pkgconfig/ (8 files)
/opt/homebrew/Cellar/imagemagick/7.1.0-54/lib/ (9 other files)
/opt/homebrew/Cellar/imagemagick/7.1.0-54/share/ImageMagick-7/ (3 files)
/opt/homebrew/Cellar/imagemagick/7.1.0-54/share/doc/ (332 files)
/opt/homebrew/Cellar/imagemagick/7.1.0-54/share/man/ (17 files)
A number of the PNG files in my attachment folders have .hocr added to the file name, and they will not open now. Additionally, when I invoke the OCR search window, it doesn't find anything. I can see that the indexing counter finishes when Obsidian opens.