[BUG] Text extraction takes much longer than before.

scambier / obsidian-text-extractor

A (companion) plugin to facilitate the extraction of text from images (OCR) and PDFs.

GNU General Public License v3.0

346 stars, 19 forks

[BUG] Text extraction takes much longer than before. #23

Closed: ccchan234 closed this issue 1 year ago

ccchan234 commented 1 year ago

Previously it was working well.

Today it can't write to a file or to the clipboard; every result is undefined.

I.e., the resulting file contains only:

undefined

followed by a link to the photo.

thank you

ccchan234 commented 1 year ago

Hi, it worked after a while, about 5 minutes I'd say.

May I ask, is the OCR process done locally?

I usually block Obsidian's internet connection with a firewall.

Today's behavior makes me feel that it's not done locally. Thanks.

ccchan234 commented 1 year ago

It now functions quite well.

Strange... I'll close the ticket and see.

thanks

scambier commented 1 year ago

BTW, text extraction is 100% done locally. There are a few parallel worker threads, and it can happen that they're all stuck for some reason. So the process is stalled until the timeout limit kicks in to start working on the next file.
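
To illustrate the stall-then-timeout behaviour described above, here is a minimal sketch of a small pool of parallel workers where each job is given a time limit, so one stuck job cannot block the queue forever. This is not the plugin's actual code: `withTimeout`, `extractAll`, the `Extractor` type, and recording a timed-out file as `undefined` are all hypothetical names and choices made only for this example.

```typescript
// Hypothetical sketch, NOT Text Extractor's real implementation.
// An Extractor stands in for whatever turns a file into text (e.g. OCR).
type Extractor = (file: string) => Promise<string>;

// Reject if `task` has not settled within `ms` milliseconds.
function withTimeout<T>(task: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms} ms`)),
      ms,
    );
    task.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Run at most `workers` extractions at once. A job that stalls is
// abandoned after `timeoutMs`, and its worker moves on to the next file.
async function extractAll(
  files: string[],
  extract: Extractor,
  workers: number,
  timeoutMs: number,
): Promise<Map<string, string | undefined>> {
  const results = new Map<string, string | undefined>();
  let next = 0; // index of the next file to hand out

  async function worker(): Promise<void> {
    while (next < files.length) {
      const file = files[next++];
      try {
        results.set(file, await withTimeout(extract(file), timeoutMs));
      } catch {
        // Assumption for this sketch: a stalled or failed job has no text.
        results.set(file, undefined);
      }
    }
  }

  const pool = Array.from({ length: Math.max(1, workers) }, () => worker());
  await Promise.all(pool);
  return results;
}
```

With a long timeout and every worker stuck, nothing completes until the limit expires, which would look like a delay of several minutes from the outside. Whether that is what happened in this report is not confirmed here.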

ccchan234 commented 1 year ago

Hi, it always results in undefined if I block Obsidian's internet access,

but it works immediately if I allow internet access in the firewall.

Could something be causing this?

Is the plugin completely open source?

Might any library it uses need to be online?

thanks

I am simply a student striving for survival; there is nothing about national security on my PC, yet.

ccchan234 commented 1 year ago

Well, I can actually back up your claim that text extraction is 100% local: after that, I disabled Obsidian's connection and rendered ANOTHER file, and it succeeded.

Quite strange, but I'll still use it. Thanks for that.

scambier commented 1 year ago

I see those are screenshots from a phone, and Text Extractor does not work on phones. It relies on cache synchronization with a computer to get the text values. See this point in the readme.

All the source code is available on this repository. Dependencies are also open source; OCR relies on Tesseract.js.

ccchan234 commented 1 year ago

Hi, those screenshots were taken on phones, but I then moved them onto Windows machines.
I saw a "clear cache" button, but clicking it doesn't seem to work. I tried uninstalling and reinstalling the plugin, but the cache still seems to be there. I will try uninstalling the plugin, deleting the plugin folder under /.obsidian, and then reinstalling.

So far it's working for me, so I'll stay with this. Thanks.

ccchan234 commented 1 year ago

OK, I see your point about the cache; maybe it's malfunctioning and causing the "undefined" result. Maybe I'll powerwash everything later. Thanks.