Some things to think about

The NN model concatenates all songs and then builds its trigrams from the preceding words, so the first words of each song are predicted from the last words of the previous song. It might make more sense to use “START” and “END” tokens for the first and last words of each song (because they are in no way related to the previous/next song), which is what I did for the basic n-gram model.
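A minimal sketch of the per-song padding idea, in plain Python. The function names are hypothetical, and padding with two “START” tokens and one “END” token is an assumption; neither model's actual padding is shown here.

```python
from typing import Iterable, List, Tuple

START, END = "START", "END"

def song_trigrams(tokens: List[str]) -> List[Tuple[str, ...]]:
    """Trigrams for one song, padded so no context crosses a song boundary."""
    # Assumption: two START tokens give the first real word a full two-word context.
    padded = [START, START] + tokens + [END]
    return [tuple(padded[i:i + 3]) for i in range(len(padded) - 2)]

def corpus_trigrams(songs: Iterable[List[str]]) -> List[Tuple[str, ...]]:
    """Build trigrams song by song instead of over one concatenated token list."""
    trigrams: List[Tuple[str, ...]] = []
    for song in songs:
        trigrams.extend(song_trigrams(song))
    return trigrams

# Two toy "songs", already tokenized.
songs = [["love", "me", "do"], ["hey", "jude"]]
trigrams = corpus_trigrams(songs)
```

With this layout, no trigram mixes the end of one song with the start of the next, so the first word of every song is always conditioned on the same “START” context.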
We do quite a lot of preprocessing: lemmatization, stop word removal, and punctuation removal. Because of this, the input to our model is not grammatically correct. I did the same in the basic NLTK n-gram model for consistency, but it is worth thinking about: how can we expect our model to generate a song that makes sense if it is never given a song that makes sense?
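To make the concern concrete, here is a rough sketch of what stop word and punctuation removal does to a single line. The stop word list and the example line are made up for illustration, the function name is hypothetical, and lemmatization is left out, so this is not the project's actual pipeline.

```python
import string

# Tiny illustrative stop word list (an assumption; the real list is not shown here).
STOP_WORDS = {"i", "you", "to", "the", "a", "and", "my", "it", "is", "with"}

def strip_stopwords_and_punct(line: str) -> list:
    """Lowercase a line, drop punctuation, then drop stop words."""
    no_punct = line.lower().translate(str.maketrans("", "", string.punctuation))
    return [word for word in no_punct.split() if word not in STOP_WORDS]

line = "I want to dance with you, and the night is young."
cleaned = strip_stopwords_and_punct(line)
print(cleaned)  # ['want', 'dance', 'night', 'young']
```

The remaining tokens keep the topic of the line but lose its grammar, which is the kind of input the model learns from and therefore the kind of output it can be expected to produce.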
