Closed by swilli6 2 years ago
I was hoping to get started with this after I got the multi-word search and the wildcard search working pretty quickly. However, I am having trouble installing pke and the nltk resources it requires, so I don't know if I'll be able to work on this. I can't find a way to install the nltk stopwords, the universal tags, and the English language model on a Mac for some reason :/
It behaves almost like the resources don't exist at all
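For anyone hitting the same install problem, here is a minimal sketch of the download commands, assuming the resources in question are the ones the pke README lists (the nltk `stopwords` and `universal_tagset` packages and the spaCy `en_core_web_sm` model). The helper names `install_commands` and `missing_packages` are made up for illustration. One possible cause of "resources don't exist" behaviour is that they were installed for a different Python interpreter than the one running pke, so the sketch builds every command around `sys.executable`. That is only a guess at the cause, not a confirmed diagnosis.

```python
import importlib.util
import sys

# Resource names taken from the pke README; that these are exactly the
# ones missing on this machine is an assumption.
NLTK_RESOURCES = ["stopwords", "universal_tagset"]
SPACY_MODEL = "en_core_web_sm"


def install_commands(python=sys.executable):
    """Build the download commands for the given Python interpreter."""
    cmds = [[python, "-m", "nltk.downloader", name] for name in NLTK_RESOURCES]
    cmds.append([python, "-m", "spacy", "download", SPACY_MODEL])
    return cmds


def missing_packages():
    """Return which of the required packages this interpreter cannot import."""
    required = ("nltk", "spacy", "pke")
    return [name for name in required if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Dry run: only print what would be executed, nothing is downloaded.
    print("Missing packages:", missing_packages() or "none")
    for cmd in install_commands():
        print(" ".join(cmd))
```

Running the printed commands with the same interpreter that runs the project should put the resources where nltk and spaCy look for them.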
Oh boy, it took a lot of wrangling and desperation, but I managed to get it working! I found a solution for installing the resources from nltk and spacy. I used the current version of lyrics2.txt in our repo as a document to extract themes from and this is the output:
I have no idea what 'cornflake girl lyrics' refers to xD
Apparently, a song by Tori Amos :D I wonder what the "sah" is. Anyhow, that looks really interesting - especially if we had even more songs in the data file. Maybe we could show these themes on the initial search page (before searching for anything)?
Yeah I think that is a good idea! Let's test out the theme extraction for a larger index and see if the results seem at all useful or sensible.
Let's see if it's a useful feature for our project
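To try this on a larger index, something like the sketch below could work. It assumes pke's `TopicRank` model, following the usage shown in the pke README. The thread doesn't say which pke model produced the output above, so that choice is a guess. `extract_themes` and `format_themes` are hypothetical helper names, and the whole data file is treated as one document.

```python
def extract_themes(text, n=10):
    """Return up to n (phrase, score) pairs for the given text.

    Assumption: TopicRank is used here; any other pke model could be
    swapped in the same way.
    """
    import pke  # imported lazily so format_themes works without pke installed

    extractor = pke.unsupervised.TopicRank()
    extractor.load_document(input=text, language="en")
    extractor.candidate_selection()   # pick candidate phrases
    extractor.candidate_weighting()   # rank the candidates
    return extractor.get_n_best(n=n)


def format_themes(keyphrases):
    """Turn (phrase, score) pairs into display strings for the search page."""
    return [f"{phrase} ({score:.3f})" for phrase, score in keyphrases]


# Example usage (needs pke and its resources installed):
#   with open("lyrics2.txt", encoding="utf-8") as f:
#       for line in format_themes(extract_themes(f.read())):
#           print(line)
```

Printing the scores next to the phrases makes it easier to eyeball whether the top themes are sensible before putting them on the initial search page.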