scambier / obsidian-text-extractor

A (companion) plugin to facilitate the extraction of text from images (OCR) and PDFs.
GNU General Public License v3.0
307 stars, 15 forks

[BUG] Image search not working. #22

Closed: YousufSSyed closed this issue 9 months ago

YousufSSyed commented 1 year ago

Problem description: I added this image to a note:

IMG_77A176A93470-1

Then I ran the "vault search" command and searched for "toggle scaling" (text in the photo, under "Mac Shortcuts"), but the page where I put the photo doesn't show up in the results.

Your environment:

Things to try:

scambier commented 1 year ago

Did you install text-extractor and enable "index images" in omnisearch?

YousufSSyed commented 1 year ago

Yes, I've had both on for a while at least.

scambier commented 1 year ago

Is it the only image to have this issue?

Right-click the image and select "extract text into a new note" to check which of the two plugins causes the issue.

image

YousufSSyed commented 1 year ago

I just reinstalled Text Extractor and I don't see the right-click menu, despite it being enabled in Text Extractor's settings. Would you know why that's the case?

scambier commented 1 year ago

Are there error logs in the developer console? It's cmd+shift+i (I think) on macOS.

YousufSSyed commented 1 year ago

This is the output after enabling only Omnisearch and Text Extractor and reloading my vault:

Text Extractor - Number of available workers: 3 for PDFs, 2 for OCR
plugin:omnisearch:45 Omnisearch - 1412 files total
plugin:omnisearch:45 Omnisearch - Cache is enabled
pixi.min.js:8 Canvas2D: Multiple readback operations using getImageData are faster with the willReadFrequently attribute set to true. See: https://html.spec.whatwg.org/multipage/canvas.html#concept-canvas-will-read-frequently
t.measureFont @ pixi.min.js:8
t.measureText @ pixi.min.js:8
r.updateText @ pixi.min.js:8
r.updateTransform @ pixi.min.js:8
e.updateTransform @ pixi.min.js:8
e.updateTransform @ pixi.min.js:8
r.render @ pixi.min.js:8
t.render @ pixi.min.js:8
renderCallback @ app.js:1
plugin:omnisearch:45 Omnisearch - Loading index from cache: 695.68701171875 ms
plugin:omnisearch:45 Omnisearch - Total number of files to add/update: 93
tesseract-core-simd.wasm.js:33 Image too small to scale!! (2x36 vs min width of 3)
Wg @ tesseract-core-simd.wasm.js:33
write @ tesseract-core-simd.wasm.js:32
write @ tesseract-core-simd.wasm.js:59
r @ tesseract-core-simd.wasm.js:106
$func3070 @ 00d44c56:0x1f5082
$func1270 @ 00d44c56:0xb455c
$func800 @ 00d44c56:0x62054
$func50 @ 00d44c56:0x3547
$func3780 @ 00d44c56:0x26a7e6
$func1445 @ 00d44c56:0xe42c3
$func1072 @ 00d44c56:0x90efd
$func2283 @ 00d44c56:0x19cafb
$func2282 @ 00d44c56:0x1950f9
$func473 @ 00d44c56:0x31db9
$zd @ 00d44c56:0x2b6b7a
b._emscripten_bind_TessBaseAPI_Recognize_1 @ tesseract-core-simd.wasm.js:165
W.Recognize @ tesseract-core-simd.wasm.js:256
L @ index.js:209
e.dispatchHandlers @ index.js:304
(anonymous) @ index.js:20
tesseract-core-simd.wasm.js:33 Line cannot be recognized!!
Wg @ tesseract-core-simd.wasm.js:33
write @ tesseract-core-simd.wasm.js:32
write @ tesseract-core-simd.wasm.js:59
r @ tesseract-core-simd.wasm.js:106
$func3070 @ 00d44c56:0x1f5082
$func1270 @ 00d44c56:0xb455c
$func800 @ 00d44c56:0x62054
$func50 @ 00d44c56:0x3547
$func3780 @ 00d44c56:0x26a815
$func1445 @ 00d44c56:0xe42c3
$func1072 @ 00d44c56:0x90efd
$func2283 @ 00d44c56:0x19cafb
$func2282 @ 00d44c56:0x1950f9
$func473 @ 00d44c56:0x31db9
$zd @ 00d44c56:0x2b6b7a
b._emscripten_bind_TessBaseAPI_Recognize_1 @ tesseract-core-simd.wasm.js:165
W.Recognize @ tesseract-core-simd.wasm.js:256
L @ index.js:209
e.dispatchHandlers @ index.js:304
(anonymous) @ index.js:20
tesseract-core-simd.wasm.js:33 Image too small to scale!! (2x36 vs min width of 3)
Wg @ tesseract-core-simd.wasm.js:33
write @ tesseract-core-simd.wasm.js:32
write @ tesseract-core-simd.wasm.js:59
r @ tesseract-core-simd.wasm.js:106
$func3070 @ 00d44c56:0x1f5082
$func1270 @ 00d44c56:0xb455c
$func800 @ 00d44c56:0x62054
$func50 @ 00d44c56:0x3547
$func3780 @ 00d44c56:0x26a7e6
$func1445 @ 00d44c56:0xe42c3
$func1072 @ 00d44c56:0x90efd
$func2283 @ 00d44c56:0x19cafb
$func2282 @ 00d44c56:0x1950f9
$func473 @ 00d44c56:0x31db9
$zd @ 00d44c56:0x2b6b7a
b._emscripten_bind_TessBaseAPI_Recognize_1 @ tesseract-core-simd.wasm.js:165
W.Recognize @ tesseract-core-simd.wasm.js:256
L @ index.js:209
e.dispatchHandlers @ index.js:304
(anonymous) @ index.js:20
tesseract-core-simd.wasm.js:33 Line cannot be recognized!!
Wg @ tesseract-core-simd.wasm.js:33
write @ tesseract-core-simd.wasm.js:32
write @ tesseract-core-simd.wasm.js:59
r @ tesseract-core-simd.wasm.js:106
$func3070 @ 00d44c56:0x1f5082
$func1270 @ 00d44c56:0xb455c
$func800 @ 00d44c56:0x62054
$func50 @ 00d44c56:0x3547
$func3780 @ 00d44c56:0x26a815
$func1445 @ 00d44c56:0xe42c3
$func1072 @ 00d44c56:0x90efd
$func2283 @ 00d44c56:0x19cafb
$func2282 @ 00d44c56:0x1950f9
$func473 @ 00d44c56:0x31db9
$zd @ 00d44c56:0x2b6b7a
b._emscripten_bind_TessBaseAPI_Recognize_1 @ tesseract-core-simd.wasm.js:165
W.Recognize @ tesseract-core-simd.wasm.js:256
L @ index.js:209
e.dispatchHandlers @ index.js:304
(anonymous) @ index.js:20
tesseract-core-simd.wasm.js:33 Image too small to scale!! (1x36 vs min width of 3)
Wg @ tesseract-core-simd.wasm.js:33
write @ tesseract-core-simd.wasm.js:32
write @ tesseract-core-simd.wasm.js:59
r @ tesseract-core-simd.wasm.js:106
$func3070 @ 00d44c56:0x1f5082
$func1270 @ 00d44c56:0xb455c
$func800 @ 00d44c56:0x62054
$func50 @ 00d44c56:0x3547
$func3780 @ 00d44c56:0x26a7e6
$func1445 @ 00d44c56:0xe42c3
$func1072 @ 00d44c56:0x90efd
$func2283 @ 00d44c56:0x19cafb
$func2282 @ 00d44c56:0x1950f9
$func473 @ 00d44c56:0x31db9
$zd @ 00d44c56:0x2b6b7a
W.Recognize @ tesseract-core-simd.wasm.js:256
L @ index.js:209
e.dispatchHandlers @ index.js:304
(anonymous) @ index.js:20
tesseract-core-simd.wasm.js:33 Line cannot be recognized!!
Wg @ tesseract-core-simd.wasm.js:33
write @ tesseract-core-simd.wasm.js:32
write @ tesseract-core-simd.wasm.js:59
r @ tesseract-core-simd.wasm.js:106
$func3070 @ 00d44c56:0x1f5082
$func1270 @ 00d44c56:0xb455c
$func800 @ 00d44c56:0x62054
$func50 @ 00d44c56:0x3547
$func3780 @ 00d44c56:0x26a815
$func1445 @ 00d44c56:0xe42c3
$func1072 @ 00d44c56:0x90efd
$func2283 @ 00d44c56:0x19cafb
$func2282 @ 00d44c56:0x1950f9
$func473 @ 00d44c56:0x31db9
$zd @ 00d44c56:0x2b6b7a
W.Recognize @ tesseract-core-simd.wasm.js:256
L @ index.js:209
e.dispatchHandlers @ index.js:304
(anonymous) @ index.js:20
plugin:omnisearch:36 Omnisearch - Search cache written
plugin:omnisearch:45 Omnisearch - Indexing total time: 153910.05004882812 ms
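
A console dump like the one above is mostly stack frames, so it can help to pull out only the message lines. The following is a minimal TypeScript sketch, assuming the log has been pasted into a plain string. `summarizeLog` and `LogSummary` are hypothetical names used for illustration; they are not part of Omnisearch, Text Extractor, or Tesseract.

```typescript
// Hypothetical helper (not part of any plugin): summarizes a pasted console log.
interface LogSummary {
  tooSmallImages: string[];      // dimensions from "Image too small to scale!!" messages
  unrecognizedLines: number;     // number of "Line cannot be recognized!!" messages
  indexingTimeMs: number | null; // value from "Indexing total time", if present
}

function summarizeLog(log: string): LogSummary {
  const summary: LogSummary = {
    tooSmallImages: [],
    unrecognizedLines: 0,
    indexingTimeMs: null,
  };

  for (const line of log.split(/\r?\n/)) {
    // e.g. "Image too small to scale!! (2x36 vs min width of 3)"
    const small = line.match(/Image too small to scale!! \((\d+x\d+) vs min width of \d+\)/);
    if (small) {
      summary.tooSmallImages.push(small[1]);
      continue;
    }

    if (line.includes("Line cannot be recognized!!")) {
      summary.unrecognizedLines += 1;
      continue;
    }

    // e.g. "Omnisearch - Indexing total time: 153910.05004882812 ms"
    const time = line.match(/Indexing total time: ([\d.]+) ms/);
    if (time) {
      summary.indexingTimeMs = parseFloat(time[1]);
    }
  }

  return summary;
}
```

Applied to the log above, this would report three "too small" messages (2x36, 2x36, 1x36), three unrecognized lines, and an indexing time of about 153910 ms, roughly two and a half minutes. The Tesseract messages concern regions only 1 or 2 pixels wide; on their own they do not show whether the screenshot in question was extracted.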

scambier commented 1 year ago

"I don't see the right click menu despite it being enabled in text extractor's settings."

Does it happen with all images and PDFs? Is this image saved in your Obsidian vault?

dlardo commented 1 year ago

FWIW I had this same issue. I restarted Obsidian a few times and it fixed itself. I did notice an OCR worker timeout in the developer console, but unfortunately I didn't grab a copy of the error to share, sorry about that.

YousufSSyed commented 1 year ago

@scambier I tried it again and it still has issues. I installed both plugins in another vault, right-clicked an image, and tried both "extract to clipboard" and "extract to a new note". Both worked, but the text doesn't show up in "Omnisearch: vault search."

scambier commented 1 year ago

Is "Images Indexing" correctly enabled in Omnisearch? New files should be indexed as soon as they are created, or at worst after a restart.