Open Katerinavts opened 6 years ago
Hi! Nice topic! Seems like a pretty challenging topic covering both text and sentiment analysis.
I'm not sure how you are going to get the genre of each song, but I'm using the Spotify API for my current project and it's pretty awesome. Hope it helps!
Look forward to seeing your updates!
What a great idea! I wish you could expand the data set a bit so you had, say, top ten songs from each year, but you can do a lot with what you have too.
I'd love to know:
something is in the air with music analysis. for a 'compared to what' i would like to understand the difference between a hit and a flop. i have yet to be really surprised by, 'pop music is so similar.' tbh, that's what makes it pop music.
but i digress. if you can get lyric diversity here and find a story, that would be cool.
As we talked about the other day, it's really important to quantify exactly what you're looking for, especially when trying to do text analysis. One of the things you mentioned was "pop music is simple," which we decided could be measured by word length and number of unique words per song.
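A minimal sketch of how those two measures could be computed for a single song, assuming the lyrics are already available as a plain string. The function name `lyric_simplicity`, the tokenizing regex, and the extra type-token ratio are illustrative assumptions, not something from the notebook.

```python
import re


def lyric_simplicity(lyrics: str) -> dict:
    """Rough 'simplicity' measures for one song's lyrics (hypothetical helper)."""
    # Lowercase, then keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", lyrics.lower())
    # Trim stray quote marks and drop tokens that were only apostrophes.
    words = [w.strip("'") for w in words]
    words = [w for w in words if w]
    if not words:
        return {"total_words": 0, "unique_words": 0,
                "avg_word_length": 0.0, "type_token_ratio": 0.0}
    unique = set(words)
    return {
        "total_words": len(words),
        "unique_words": len(unique),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Share of distinct words; an added assumption that makes songs
        # of different lengths easier to compare.
        "type_token_ratio": len(unique) / len(words),
    }


stats = lyric_simplicity("Baby baby, oh baby")
print(stats)
```

Raw unique-word counts grow with song length, so a ratio like the one above may be a fairer basis for comparing songs across decades.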
Pitch
It is mostly the music that makes a song a "hit". But what about the lyrics?
Artists and music critics argue that pop music has become more and more homogeneous. A Pudding piece examined the similarity of music over the decades, but I would like to focus on the lyrics.
I scraped the lyrics of Billboard's Greatest of All Time Hot 100 Songs and will use text and sentiment analysis to determine patterns.
Since the dataset spans many decades, I would also like to understand:
Inspiration: https://www.nytimes.com/interactive/2017/03/09/magazine/25-songs-that-tell-us-where-music-is-going.html#/intro
Details
Possible headline(s): The lyric diversity of hit songs
Data set(s): https://www.billboard.com/charts/greatest-hot-100-singles https://www.azlyrics.com/
Code repository: https://github.com/Katerinavts/Data-Studio/blob/master/Scrapping%20100%20Hot%20Billboard%20Songs%20of%20all%20times_ver1.ipynb
Possible problems/fears/questions: Limits in visualization, since it is not going to be interactive
Work so far
I have scraped the lyrics and merged them with the list of songs. I am currently cleaning the lyrics with regex, and then I am going to analyze the data.
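A rough sketch of what the regex cleaning step might look like. It assumes the scraped text can contain bracketed section labels such as "[Chorus]", leftover HTML tags, and irregular whitespace. The function name `clean_lyrics` and the patterns are illustrative assumptions, not the code in the notebook.

```python
import re


def clean_lyrics(raw: str) -> str:
    """Strip common scraping residue from a lyrics string (hypothetical helper)."""
    # Drop bracketed annotations such as "[Chorus]" or "[Verse 1]".
    text = re.sub(r"\[[^\]]*\]", " ", raw)
    # Drop any leftover HTML tags such as "<br>".
    text = re.sub(r"<[^>]+>", " ", text)
    # Collapse newlines, tabs, and repeated spaces into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


print(clean_lyrics("[Chorus]\nHey <br> Jude,\r\n  don't make it bad"))
```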
Checklist
This checklist must be completed before you submit your draft.