atomicdata-dev / atomic-server

An open source headless CMS / real-time database. Powerful table editor, full-text search, and SDKs for JS / React / Svelte.
https://atomicserver.eu
MIT License
1.07k stars, 49 forks

Extract text from imported (PDF / Word / Office) files #477

Open joepio opened 2 years ago

joepio commented 2 years ago

Being able to search inside the PDF files uploaded to Atomic Server would be a really nice addition.

Goals:

Non-goals:

There are some tools that could help with this:

AlexMikhalev commented 9 months ago

https://crates.io/crates/pdfium-render is a new contender. It was recently recommended on the Rust subreddit.