This repository contains:
Please cite this paper if you use our dataset or code.
See examples of metadata analysis that can be done using metadata from Discogs.
This is the code we used to create our release dataset and to run the example studies presented in our ISMIR 2017 paper.
Run pip install -r requirements.txt to install the required dependencies.
config.py: basic configuration script; contains global variables (such as filenames) used by the other scripts.
preprocess_releases_xml_to_json.py: downloads the original XML dump archive and converts a subset of its metadata fields to a JSON dump.
preprocess_releases_json_to_hdf_pandas.py: further simplifies the metadata by removing and recoding some fields, and outputs an HDF file with a pandas DataFrame.
analyze.py: a collection of useful functions for analyzing the dataset.
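To give a rough idea of what the XML-to-JSON step and the later recoding step involve, here is a minimal sketch. It is not the code from the scripts above: the sample XML, the subset of fields kept, and the helper names (release_to_dict, xml_to_json_lines, release_year) are all assumptions made for illustration.

```python
import io
import json
import xml.etree.ElementTree as ET

# Tiny stand-in for the Discogs releases dump (the real dump is a large
# XML archive). The fields kept below are an illustrative assumption.
SAMPLE_XML = b"""<releases>
<release id="1" status="Accepted">
  <title>Stockholm</title>
  <artists><artist><id>1</id><name>The Persuader</name></artist></artists>
  <genres><genre>Electronic</genre></genres>
  <styles><style>Deep House</style></styles>
  <country>Sweden</country>
  <released>1999-03-00</released>
</release>
<release id="2" status="Accepted">
  <title>Untitled</title>
  <artists><artist><id>2</id><name>Some Artist</name></artist></artists>
  <genres><genre>Rock</genre><genre>Pop</genre></genres>
</release>
</releases>"""


def release_to_dict(elem):
    """Keep a small subset of fields from one <release> element."""
    return {
        "id": int(elem.get("id")),
        "title": elem.findtext("title"),
        "artists": [a.findtext("name") for a in elem.findall("artists/artist")],
        "genres": [g.text for g in elem.findall("genres/genre")],
        "styles": [s.text for s in elem.findall("styles/style")],
        "country": elem.findtext("country"),
        "released": elem.findtext("released"),
    }


def xml_to_json_lines(source):
    """Stream <release> elements and yield one JSON string per release."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "release":
            yield json.dumps(release_to_dict(elem))
            elem.clear()  # release memory; matters for a multi-GB dump


def release_year(released):
    """Recode a date such as '1999-03-00' to an integer year, or None."""
    if released and released[:4].isdigit():
        return int(released[:4])
    return None


# Step 1: XML -> JSON lines. Step 2: simplify and recode some fields.
json_lines = list(xml_to_json_lines(io.BytesIO(SAMPLE_XML)))
records = [json.loads(line) for line in json_lines]
for record in records:
    record["year"] = release_year(record.pop("released"))
```

The resulting list of dictionaries could then be loaded with pandas.DataFrame(records) and stored with DataFrame.to_hdf, which requires the PyTables package. Streaming with iterparse and clearing each element keeps memory use low, which is why it is used here instead of parsing the whole file at once.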