Input
A dictionary of speaker names mapped to lists of paths to audio files:
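As a rough illustration of that shape, the dictionary could look like the sketch below. The variable name speaker_paths, the speaker names, and the file paths are all hypothetical placeholders, not values taken from the actual data.

```python
# Hypothetical input: each key is a speaker name, each value is a list of
# paths to that speaker's audio files. All names and paths are made up.
speaker_paths = {
    "speaker_1": [
        "audio/speaker_1/clip_01.wav",
        "audio/speaker_1/clip_02.wav",
    ],
    "speaker_2": [
        "audio/speaker_2/clip_01.wav",
    ],
}
```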
Output
The arrays X_train, y_train, X_validation, and y_validation, consisting of shuffled data from the provided audio files (each row of X is a spectrogram and the matching row of y is its label). In addition, we want y_mapping, a list of speakers in the order of the classes in y (i.e., if y_mapping is ['speaker_1', 'speaker_2'], then the one-hot label (0, 1) corresponds to predicting speaker_2).
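One possible way to produce these outputs is sketched below with NumPy. Several things here are assumptions rather than part of the task description: the function name build_dataset, the to_spectrogram callable (a stand-in for whatever loads a file and returns a 2-D array with one spectrogram per row), the 20% validation fraction, and the choice to order classes alphabetically. The fake_spectrogram helper exists only so the sketch runs without real audio.

```python
import numpy as np


def build_dataset(speaker_paths, to_spectrogram, validation_fraction=0.2, seed=0):
    """Turn {speaker: [audio paths]} into shuffled train/validation arrays.

    to_spectrogram is a hypothetical callable: path -> 2-D array whose rows
    are the spectrograms extracted from that file.
    """
    y_mapping = sorted(speaker_paths)  # class index -> speaker name
    X_parts, y_parts = [], []
    for class_index, speaker in enumerate(y_mapping):
        for path in speaker_paths[speaker]:
            rows = np.asarray(to_spectrogram(path), dtype=np.float32)
            # One one-hot label row per spectrogram row.
            labels = np.zeros((rows.shape[0], len(y_mapping)), dtype=np.float32)
            labels[:, class_index] = 1.0
            X_parts.append(rows)
            y_parts.append(labels)
    X = np.concatenate(X_parts)
    y = np.concatenate(y_parts)

    # Apply the same permutation to X and y so rows stay aligned.
    order = np.random.default_rng(seed).permutation(len(X))
    X, y = X[order], y[order]

    # Hold out the first slice of the shuffled rows for validation.
    n_validation = int(len(X) * validation_fraction)
    X_validation, y_validation = X[:n_validation], y[:n_validation]
    X_train, y_train = X[n_validation:], y[n_validation:]
    return X_train, y_train, X_validation, y_validation, y_mapping


def fake_spectrogram(path):
    # Stand-in for real audio loading: 10 rows of 8 features per file.
    return np.random.default_rng(len(path)).random((10, 8))


# Hypothetical speakers and paths, used only to exercise the sketch.
speaker_paths = {
    "speaker_1": ["audio/speaker_1/clip_01.wav", "audio/speaker_1/clip_02.wav"],
    "speaker_2": ["audio/speaker_2/clip_01.wav"],
}
X_train, y_train, X_validation, y_validation, y_mapping = build_dataset(
    speaker_paths, fake_spectrogram
)
```

Note that this sketch shuffles and splits at the row level, so rows from the same audio file can end up in both the training and the validation set. Splitting by file before extracting spectrograms would avoid that, at the cost of a slightly more involved function.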