zadam / trilium

Build your personal knowledge base with Trilium Notes
GNU Affero General Public License v3.0
27.2k stars 1.9k forks

(Feature request) Can we search in the attached office files? #4264

Closed metorm closed 5 months ago

metorm commented 1 year ago

Describe feature

Sometimes we want to search for text inside attached office files, typically docx or PDF files.

Is this possible? If not, where could we start writing a script to do it?

Additional Information

No response

zerebos commented 1 year ago

The find widget does not work on file-type notes afaik. Writing a script for this is possible on desktop, pretty easily, using Electron's native API, and I've been tinkering with this. markedjs, which is what is used for the find widget, sometimes struggles with iframes, which is what's used for things like PDF files.
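
For readers wondering what "Electron's native API" could look like here, the following is a minimal sketch, not the script from this thread. It assumes the approach is built on Electron's `webContents.findInPage` and its `found-in-page` event. `FindableContents` and `countMatches` are hypothetical names, and the interface is a locally declared subset of `webContents` so the snippet stands alone without importing Electron.

```typescript
// Assumed subset of the result object Electron passes to "found-in-page".
interface FoundInPageResult {
  requestId: number;
  matches: number;
  finalUpdate: boolean;
}

type FoundListener = (event: unknown, result: FoundInPageResult) => void;

// Hypothetical structural subset of Electron's `webContents`, declared
// locally so this sketch type-checks without the Electron package.
interface FindableContents {
  findInPage(text: string): number;
  on(event: "found-in-page", listener: FoundListener): unknown;
  removeListener(event: "found-in-page", listener: FoundListener): unknown;
}

// Hypothetical helper: resolves with the number of matches for `text`.
function countMatches(contents: FindableContents, text: string): Promise<number> {
  return new Promise((resolve) => {
    let requestId: number | undefined;

    const listener: FoundListener = (_event, result) => {
      // Skip results that belong to another request or are not final yet.
      if (result.requestId !== requestId || !result.finalUpdate) return;
      contents.removeListener("found-in-page", listener);
      resolve(result.matches);
    };

    // Register before starting the search so no result is missed.
    contents.on("found-in-page", listener);
    requestId = contents.findInPage(text);
  });
}
```

In an Electron app this would be called as something like `countMatches(win.webContents, "query")`, followed by `stopFindInPage` to clear the highlights. It does not address the iframe problem mentioned above for PDF files.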

zerebos commented 1 year ago

So after some tinkering, it definitely seems like this is a viable idea

https://github.com/zadam/trilium/assets/6865942/75a44f09-3444-45da-9755-d33007fbb544

But I have some more work to do to mitigate the jankiness when searching PDFs specifically

https://github.com/zadam/trilium/assets/6865942/633bc74e-ec8d-4a65-b4fa-59f35521aac0

metorm commented 1 year ago

I look forward to this feature. Thank you for your contribution.

zadam commented 1 year ago

A while ago I experimented with parsing PDF files and extracting text nodes, which were then saved in searchable attachments. But it didn't get beyond a prototype, and it wouldn't work with other formats such as docx, which would require a custom implementation.
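
The idea of saving extracted text so it can be searched, with one extractor per file format, could be sketched roughly as below. This is an illustration under assumptions, not the prototype described above. `AttachmentTextIndex` and `Extractor` are hypothetical names, and the actual PDF or docx parsing is left to whatever extractor function is plugged in.

```typescript
// Hypothetical extractor: turns raw file bytes into plain text.
// A real one would wrap a PDF or docx parsing library.
type Extractor = (data: Uint8Array) => string;

// Hypothetical in-memory index from attachment ID to extracted text.
class AttachmentTextIndex {
  private readonly extractors: Map<string, Extractor>;
  private readonly texts = new Map<string, string>();

  // `extractors` maps a MIME type to the extractor for that format.
  constructor(extractors: Map<string, Extractor>) {
    this.extractors = extractors;
  }

  // Extracts and stores text; returns false when no extractor is
  // registered for the MIME type (e.g. docx without a custom one).
  add(attachmentId: string, mime: string, data: Uint8Array): boolean {
    const extract = this.extractors.get(mime);
    if (!extract) return false;
    this.texts.set(attachmentId, extract(data));
    return true;
  }

  // Case-insensitive substring search; returns matching attachment IDs.
  search(query: string): string[] {
    const needle = query.trim().toLowerCase();
    if (needle === "") return [];
    const hits: string[] = [];
    this.texts.forEach((text, id) => {
      if (text.toLowerCase().indexOf(needle) !== -1) hits.push(id);
    });
    return hits;
  }
}
```

Keying extractors by MIME type reflects the point that each format needs its own implementation: a format with no registered extractor is simply left out of the index.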

meichthys commented 5 months ago

Trilium has entered maintenance mode. Future enhancements will be addressed in TriliumNext: https://github.com/TriliumNext/Notes/issues/76