BlueWaveTechnologies / BlueWave

Web application used to create charts and dashboards using a graph database
MIT License

DocumentService - Index uploaded files #218

Closed pborissow closed 2 years ago

pborissow commented 2 years ago

Create a Lucene index of the files uploaded to the server (i.e. the upload directory). The DocumentService will create the index on startup (if one does not exist) and update it as new files are uploaded. The Lucene index will be persisted on disk.

The index will be restricted to PDF files for now.

We should be able to use PDFBox to extract text from the PDF documents and use the text to populate the index. More here: https://stackoverflow.com/a/23762284/
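
A rough sketch of the startup pass described above, i.e. working out which uploaded PDFs are not in the index yet. `UploadScanner`, `isPdf`, `findUnindexedPdfs`, and the idea of tracking indexed files by their path relative to the upload directory are all hypothetical names and assumptions for illustration, not existing BlueWave code. Text extraction (PDFBox) and the actual index writes (Lucene) are deliberately left out; this only covers the file selection step and uses nothing beyond the JDK.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not part of BlueWave): selects uploaded files that still need indexing.
class UploadScanner {

    // True when the file name ends in ".pdf", ignoring case.
    static boolean isPdf(Path file) {
        return file.getFileName().toString().toLowerCase(Locale.ROOT).endsWith(".pdf");
    }

    // Walks the upload directory and returns the PDF files whose path, relative to
    // uploadDir, is not in indexedPaths. Assumption: the index records each document
    // under that relative path, so an empty set means "no index yet, index everything".
    static List<Path> findUnindexedPdfs(Path uploadDir, Set<String> indexedPaths) throws IOException {
        try (Stream<Path> files = Files.walk(uploadDir)) {
            return files
                .filter(Files::isRegularFile)
                .filter(UploadScanner::isPdf)
                .filter(f -> !indexedPaths.contains(uploadDir.relativize(f).toString()))
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

On startup the service could call `findUnindexedPdfs(uploadDir, pathsAlreadyInIndex)` and pass each returned file to the PDFBox/Lucene step. The same check could run for a single file right after an upload completes.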