NovaFrost / SHS100K

metadata for SHS100K

This repository contains the metadata for the Second Hand Songs 100K (SHS100K) dataset. The dataset contains about 10,000 songs and 100,000 tracks. It is split into three sets: SHS100K-TRAIN, SHS100K-VAL and SHS100K-TEST. The metadata is provided in the list file. The raw audio can be crawled with youtube-dl using the provided URLs.
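
A minimal sketch of how one track might be fetched with youtube-dl from Python. The helper names (build_command, download) and the output naming scheme are hypothetical and not part of this repository. It assumes the youtube-dl executable is on the PATH and that ffmpeg is installed for audio extraction.

```python
import subprocess
from typing import List


def build_command(url: str, out_stem: str, audio_format: str = "wav") -> List[str]:
    """Build a youtube-dl command line that extracts the audio of one URL.

    out_stem is a hypothetical output path without extension, for example
    "audio/<set_id>_<ver_id>"; youtube-dl fills in the extension itself.
    """
    return [
        "youtube-dl",
        "--extract-audio",               # keep only the audio stream (needs ffmpeg)
        "--audio-format", audio_format,  # convert to the requested format
        "--no-playlist",                 # fetch a single video, not a whole playlist
        "--output", f"{out_stem}.%(ext)s",
        url,
    ]


def download(url: str, out_stem: str) -> bool:
    """Run youtube-dl and report whether it exited successfully."""
    result = subprocess.run(build_command(url, out_stem), check=False)
    return result.returncode == 0
```

Calling download(url, out_stem) for each row of the metadata is one way to crawl the audio. Videos may have been removed since the metadata was collected, so a False return value should be expected for some tracks.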

Metadata for Second Hand Songs 100K Dataset

The list file contains the metadata of this dataset, with the fields set_id, ver_id, title, performer, url and status:
set_id: Index of the song
ver_id: Index of different versions
title: Song's name
performer: Performer's name
url: YouTube URL of the track
status: Whether the track could be downloaded by youtube-dl
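
The fields above could be read into records as sketched below. This is only an illustration under stated assumptions: it assumes one track per line, tab-separated, with the six fields in the order listed above and integer set_id and ver_id. The Track class and parse_list function are hypothetical names. Check the actual list file and adjust the separator if it differs.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

FIELDS = ("set_id", "ver_id", "title", "performer", "url", "status")


@dataclass(frozen=True)
class Track:
    set_id: int     # index of the song
    ver_id: int     # index of the version
    title: str
    performer: str
    url: str
    status: str     # kept as raw text; its encoding is not assumed here


def parse_list(lines: Iterable[str], sep: str = "\t") -> Iterator[Track]:
    """Yield one Track per non-empty line (assumed layout: six sep-separated fields)."""
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        parts = line.split(sep)
        if len(parts) != len(FIELDS):
            raise ValueError(
                f"line {number}: expected {len(FIELDS)} fields, got {len(parts)}"
            )
        set_id, ver_id, title, performer, url, status = parts
        yield Track(int(set_id), int(ver_id), title, performer, url, status)
```

With the file opened as f (for example with open("list", encoding="utf-8")), list(parse_list(f)) gives all tracks in file order.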

Training Set

SHS100K-TRAIN: contains only the set_id and ver_id of each track

Validation Set

SHS100K-VAL: contains only the set_id and ver_id of each track

Test Set

SHS100K-TEST: contains only the set_id and ver_id of each track
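
Since the split files hold only set_id and ver_id, they can be read as sets of (set_id, ver_id) pairs and grouped by song. The sketch below assumes two whitespace-separated integers per line. The names read_split and group_by_song are hypothetical helpers, not part of this repository.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Key = Tuple[int, int]  # (set_id, ver_id)


def read_split(lines: Iterable[str]) -> Set[Key]:
    """Read (set_id, ver_id) pairs, assuming two whitespace-separated integers per line."""
    keys: Set[Key] = set()
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(f"line {number}: expected 2 fields, got {len(fields)}")
        keys.add((int(fields[0]), int(fields[1])))
    return keys


def group_by_song(keys: Iterable[Key]) -> Dict[int, List[int]]:
    """Map each set_id to the sorted ver_ids of its versions in this split."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for set_id, ver_id in sorted(keys):
        groups[set_id].append(ver_id)
    return dict(groups)
```

Tracks that share a set_id are versions of the same song, so the grouped form is convenient for building positive pairs. The full metadata of a track can be recovered by looking up its (set_id, ver_id) pair in the list file.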