scambier / obsidian-text-extractor

A (companion) plugin to facilitate the extraction of text from images (OCR) and PDFs.
GNU General Public License v3.0
346 stars, 19 forks

[Feature request] Support indexing HTML files #8

Open Quorafind opened 2 years ago

Quorafind commented 2 years ago

Is your feature request related to a problem? Please describe.

Omnisearch currently works with all plaintext files. Can it also work with HTML files? I always save web pages as HTML using Save Page WE.

Describe the solution you'd like

Treat HTML files as plain-text (.txt) files.

Describe alternatives you've considered

Use grep to search the files.

Additional context

None.

scambier commented 2 years ago

can it work with HTML file?

Yes, but actually no. Omnisearch will index all "words" in HTML files, including tags and attributes. HTML files need to be indexed as a special case, like PDFs.
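
To illustrate what "indexing as a special case" could mean for HTML, here is a minimal sketch in Python that keeps only the visible text of a page and drops tags, attributes, and the contents of script and style elements. It is not code from Omnisearch or Text Extractor; the names VisibleTextExtractor and html_to_text are hypothetical, and it relies only on the standard library's html.parser module.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes and ignores markup."""

    # Elements whose contents are code or styling, not readable text.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # Attributes (attrs) are deliberately never stored.
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside script/style elements.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join text nodes with spaces and collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document as one string."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    '<html><head><style>p {color: red}</style></head>'
    '<body><p class="note">Hello <b>world</b></p>'
    '<script>var x = 1;</script></body></html>'
)
print(html_to_text(sample))  # prints "Hello world"
```

The resulting string could then be handed to a search index in place of the raw file contents, so that words such as "class" or "script" no longer match every saved page. Joining text nodes with a space is a simplification: a word split by inline markup (for example `wor<b>ld</b>`) would be indexed as two tokens.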