Vaishnavi Behl - Final Assignment - MY472

Project Description

This repository contains my final summative assignment for the MY472 course. The project examines how the artists on Rolling Stone magazine's list of the 100 greatest musical artists of all time have endured over the years, and whether any features or characteristics help explain enduring engagement or a decline in popularity.

Installation

To run the analysis, install the following R packages:

httr
jsonlite
dplyr
RSelenium
rvest
spotifyr
ggplot2
readr
tidyverse
stringr
ggrepel
reshape2
corrplot
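
One way to check which of these packages are still missing is sketched below. The helper name find_missing is only illustrative and is not part of the project code, and the install call is left commented out so nothing is installed by accident. Note that tidyverse already bundles dplyr, ggplot2, readr, and stringr, so installing it covers those four as well.

```r
# Packages listed in this README.
required_packages <- c(
  "httr", "jsonlite", "dplyr", "RSelenium", "rvest", "spotifyr",
  "ggplot2", "readr", "tidyverse", "stringr", "ggrepel", "reshape2",
  "corrplot"
)

# Hypothetical helper: return the entries of `pkgs` that are not installed.
find_missing <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

missing_packages <- find_missing(required_packages)

if (length(missing_packages) > 0) {
  message("Not installed: ", paste(missing_packages, collapse = ", "))
  # Uncomment to install the missing packages from CRAN:
  # install.packages(missing_packages)
} else {
  message("All required packages are installed.")
}
```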

Data Sources

This project utilizes data from three main sources:

Rolling Stone's 100 Greatest Artists: A list compiled by Rolling Stone magazine, detailing the 100 greatest music artists of all time. This list can be found on Rolling Stone's website.
Spotify Web API: Data retrieved from Spotify's Web API, providing detailed information about songs, artists, and user listening habits.
Wikipedia - List of Best-Selling Music Artists: A comprehensive list of the best-selling music artists, sourced from Wikipedia. This list includes artists with claims of 75 million or more record sales.

Each of these sources contributes unique data that supports the analysis conducted in this project.
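
The Spotify Web API requires client credentials. The sketch below assumes they are stored in the environment variables SPOTIFY_CLIENT_ID and SPOTIFY_CLIENT_SECRET, which spotifyr reads by default. The helper has_spotify_credentials is hypothetical and is not taken from the project code; how the project itself handles credentials is not documented here.

```r
# Environment variable names that spotifyr reads by default.
spotify_vars <- c("SPOTIFY_CLIENT_ID", "SPOTIFY_CLIENT_SECRET")

# Hypothetical helper: TRUE when every variable in `vars` is set and non-empty.
has_spotify_credentials <- function(vars = spotify_vars) {
  all(nzchar(Sys.getenv(vars)))
}

if (has_spotify_credentials() && requireNamespace("spotifyr", quietly = TRUE)) {
  # Requests an access token using the credentials from the environment.
  access_token <- spotifyr::get_spotify_access_token()
} else {
  message(
    "Set ", paste(spotify_vars, collapse = " and "),
    " (for example in .Renviron) and install spotifyr before calling the Spotify API."
  )
}
```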

Data Files

This project uses four CSV files:

final_data.csv: The final dataset produced by the data processing in this project. It combines the aggregated and cleaned data from the sources above and is the input for the final visualization and interpretation stages.
rolling_stone_artists.csv: Taken from Rolling Stone's list of the 100 Greatest Artists. It contains the artist names, their ranking on the list, and other attributes used for comparison with the other datasets.
spotify_data.csv: Data obtained from the Spotify Web API, covering music tracks, artists, and user preferences. It is used to study patterns in music consumption and artist popularity.
spotify_sample_data.csv: A subset of the larger Spotify dataset, used mainly for initial testing, exploratory data analysis, and method validation.

Each CSV file serves a distinct purpose, from providing foundational data to supporting complex analyses and visualizations, forming the backbone of this research project.
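
As a quick check that the files are in place, the sketch below reads each one and prints its dimensions. It assumes the CSV files sit in the current working directory and uses base R's read.csv so that it runs without extra packages (readr::read_csv would work similarly). The helper load_if_present is illustrative only, and no column names are assumed.

```r
# CSV files described in this README.
data_files <- c(
  "final_data.csv",
  "rolling_stone_artists.csv",
  "spotify_data.csv",
  "spotify_sample_data.csv"
)

# Hypothetical helper: read a CSV if it exists, otherwise return NULL.
load_if_present <- function(path) {
  if (!file.exists(path)) {
    message("Skipping missing file: ", path)
    return(NULL)
  }
  read.csv(path, stringsAsFactors = FALSE)
}

# Named list with one entry per file (NULL where the file was not found).
datasets <- lapply(data_files, load_if_present)
names(datasets) <- data_files

# Report the size of every file that was loaded.
for (name in names(datasets)) {
  if (!is.null(datasets[[name]])) {
    cat(name, ":", nrow(datasets[[name]]), "rows,",
        ncol(datasets[[name]]), "columns\n")
  }
}
```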