Open kernpunkt-thermann opened 7 years ago
Hi!
That would require a lot of work, and it's not planned right now.
I would welcome any PR that tries to add this feature.
If anyone is trying to do this, it looks like this tool might be helpful: https://www.npmjs.com/package/pdf-text-extract
The problem with that approach is that we cannot use CSS selectors to find the content to index. But it is a start! Thanks for sharing.
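To illustrate the difference: a PDF text extractor typically gives back plain text per page rather than elements, so records would have to be built from page text instead of from selector matches. Here is a rough sketch of that idea, assuming the page text has already been extracted by some tool. The helper name `pages_to_records`, the record fields, and the example URL are all hypothetical and not part of any existing crawler API.

```python
from typing import Dict, List


def pages_to_records(
    url: str, pages: List[str], max_chars: int = 500
) -> List[Dict[str, object]]:
    """Turn per-page PDF text into flat records for a search index.

    Hypothetical helper: the record shape (url, page, content) is an
    assumption made for illustration only.
    """
    records: List[Dict[str, object]] = []
    for page_number, text in enumerate(pages, start=1):
        chunk: List[str] = []
        length = 0
        # split() with no arguments collapses the irregular whitespace
        # that PDF layout usually leaves behind.
        for word in text.split():
            # Close the current record if this word would exceed the limit.
            if chunk and length + 1 + len(word) > max_chars:
                records.append(
                    {"url": url, "page": page_number, "content": " ".join(chunk)}
                )
                chunk, length = [], 0
            length += len(word) + (1 if chunk else 0)
            chunk.append(word)
        if chunk:
            records.append(
                {"url": url, "page": page_number, "content": " ".join(chunk)}
            )
    return records


# Example input: two pages of already-extracted text (made-up content).
example = pages_to_records(
    "https://example.com/manual.pdf",
    ["Installation   guide\nStep one", "Configuration reference"],
    max_chars=20,
)
```

The page number stands in for the hierarchy that selectors would normally provide, which is coarser but at least lets a search hit point back to a location in the document.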
Hi,
Are there any ideas or plans for crawling documents, especially PDFs?
Regards from Germany