steffenfritz / FileTrove

FileTrove indexes files and creates metadata from them.
https://filetrove.fritz.wtf
GNU Affero General Public License v3.0

[CHANGE] Text document indexing #4

Closed: steffenfritz closed this 8 months ago

steffenfritz commented 11 months ago

Describe the solution you'd like

FileTrove should index text documents, remove stop words, identify relevant tokens, and create a small list of keywords that describes each text document.

Using Xapian might be an option.
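
For illustration only, the requested pipeline (tokenize, drop stop words, keep the most relevant tokens) could look roughly like the sketch below. This is not FileTrove code and it does not use Xapian; it swaps in plain term-frequency counting from the Python standard library. The function name `describe_text`, the `max_words` parameter, and the tiny stop-word list are all assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from",
    "in", "is", "it", "of", "on", "or", "that", "the", "this", "to", "with",
})


def describe_text(text, max_words=5):
    """Return up to max_words of the most frequent non-stop-word tokens.

    Hypothetical helper: relevance is approximated by raw term frequency.
    """
    # Lowercase the text and split it into alphabetic tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Drop stop words and very short tokens.
    relevant = [t for t in tokens if t not in STOP_WORDS and len(t) > 2]
    # Keep the most frequent remaining tokens as the description.
    return [word for word, _ in Counter(relevant).most_common(max_words)]


if __name__ == "__main__":
    sample = "The archive holds the archive records and the metadata of the archive."
    print(describe_text(sample, max_words=3))
```

Raw frequency is the simplest possible relevance measure. A search library such as Xapian, or a weighting scheme such as TF-IDF across many documents, would likely give better keywords.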

steffenfritz commented 8 months ago

This is out of scope for FileTrove. Other tools are better suited to such tasks. Nevertheless, FileTrove's output could help by identifying relevant files.