A first way to explore the data is to build a bag-of-words representation for each song. The data is already in {word: count} format, but we need to convert it to NumPy arrays, possibly with an option to limit the vocabulary size (the full 5k words in the dataset will be hard to work with).
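A minimal sketch of that conversion, assuming each song is available as a plain Python dict of {word: count} and that the songs are held in a list. The names `build_bow_matrix` and `max_words` are hypothetical, chosen for illustration only. When `max_words` is set, the vocabulary is cut to the words with the highest total count across all songs, which is one possible way to reduce it, not necessarily the best one.

```python
from collections import Counter

import numpy as np


def build_bow_matrix(songs, max_words=None):
    """Convert a list of {word: count} dicts into a dense (n_songs, n_words) array.

    Hypothetical helper: returns the matrix and the vocabulary (column order).
    """
    # Total count of each word across all songs, used to rank the vocabulary.
    totals = Counter()
    for counts in songs:
        totals.update(counts)

    # Sort by descending total count, breaking ties alphabetically so the
    # column order is deterministic; keep only the top max_words if requested.
    ranked = sorted(totals, key=lambda w: (-totals[w], w))
    vocab = ranked[:max_words] if max_words is not None else ranked
    index = {word: col for col, word in enumerate(vocab)}

    matrix = np.zeros((len(songs), len(vocab)), dtype=np.int32)
    for row, counts in enumerate(songs):
        for word, count in counts.items():
            col = index.get(word)
            if col is not None:  # words cut from the vocabulary are dropped
                matrix[row, col] = count
    return matrix, vocab


# Toy data standing in for the real songs (made up for this example).
songs = [{"love": 3, "you": 2}, {"you": 1, "baby": 4}]
X, vocab = build_bow_matrix(songs, max_words=2)
print(vocab)    # ['baby', 'love']
print(X.shape)  # (2, 2)
```

A dense array is fine for a small vocabulary. With the full 5k words and many songs, most entries will be zero, so a sparse matrix (for example from `scipy.sparse`) may be a better fit.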