felixbur / nkululeko

Machine learning speaker characteristics
MIT License

Add working example for speaker verification #83

bagustris closed this 9 months ago

bagustris commented 9 months ago

This adds a speaker verification example using the RAVDESS dataset (feature: spkrec-ecapa-voxceleb).

Steps to run the speaker verification example:

$ cd data/ravdess
$ python3 process_database_speaker.py 
Total length: 1440, Train set: 1152, Test set: 288
$ cd ../..
$ python3 -m nkululeko.resample --config data/ravdess/exp_speaker.ini
$ python3 -m nkululeko.nkululeko --config data/ravdess/exp_speaker.ini
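
The counts printed by process_database_speaker.py (1440 total, 1152 train, 288 test) correspond to an 80/20 split. The following is a minimal sketch of how such a split could be produced, assuming a per-speaker split with 60 utterances for each of the 24 speakers. The function name, file names, and seed are hypothetical, and the actual script may work differently.

```python
import random
from collections import defaultdict


def split_by_speaker(items, test_fraction=0.2, seed=42):
    """Split (file, speaker) pairs so that every speaker appears in both sets.

    Hypothetical helper for illustration; not taken from nkululeko.
    """
    by_speaker = defaultdict(list)
    for path, speaker in items:
        by_speaker[speaker].append(path)

    rng = random.Random(seed)
    train, test = [], []
    for speaker in sorted(by_speaker):
        files = sorted(by_speaker[speaker])
        rng.shuffle(files)
        # Hold out the same fraction of each speaker's files for testing.
        n_test = round(len(files) * test_fraction)
        test += [(f, speaker) for f in files[:n_test]]
        train += [(f, speaker) for f in files[n_test:]]
    return train, test


# Made-up file list: 24 speakers x 60 utterances = 1440 items.
items = [
    (f"spk{s:02d}_utt{u:02d}.wav", f"spk{s:02d}")
    for s in range(1, 25)
    for u in range(60)
]
train, test = split_by_speaker(items)
print(f"Total length: {len(items)}, Train set: {len(train)}, Test set: {len(test)}")
```

Splitting within each speaker keeps all 24 speakers in both sets, which is what the log reports ("24 speakers in test and 24 speakers in train").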

Sample outputs:

DEBUG nkululeko: running results/exp_ravdess_speaker from config data/ravdess/exp_speaker.ini, nkululeko version 0.66.6
DEBUG dataset: loading train
DEBUG dataset: value for audio_path not found, using default: 
DEBUG dataset: Loaded database train with 1152 samples: got targets: True, got speakers: True (24), got sexes: True, got age: False
DEBUG dataset: loading test
DEBUG dataset: value for audio_path not found, using default: 
DEBUG dataset: Loaded database test with 288 samples: got targets: True, got speakers: True (24), got sexes: True, got age: False
DEBUG experiment: target: speaker
DEBUG experiment: Target labels (user defined): ['spk01', 'spk02', 'spk03', 'spk04', 'spk05', 'spk06', 'spk07', 'spk08', 'spk09', 'spk10', 'spk11', 'spk12', 'spk13', 'spk14', 'spk15', 'spk16', 'spk17', 'spk18', 'spk19', 'spk20', 'spk21', 'spk22', 'spk23', 'spk24']
DEBUG experiment: loaded databases train,test
DEBUG experiment: reusing previously stored ./results/exp_ravdess_speaker/./store/testdf.csv and ./results/exp_ravdess_speaker/./store/traindf.csv
DEBUG experiment: value for filter.sample_selection not found, using default: all
DEBUG experiment: value for type not found, using default: dummy
DEBUG experiment: Categories test (nd.array): ['spk01' 'spk02' 'spk03' 'spk04' 'spk05' 'spk06' 'spk07' 'spk08' 'spk09'
 'spk10' 'spk11' 'spk12' 'spk13' 'spk14' 'spk15' 'spk16' 'spk17' 'spk18'
 'spk19' 'spk20' 'spk21' 'spk22' 'spk23' 'spk24']
DEBUG experiment: Categories train (nd.array): ['spk01' 'spk02' 'spk03' 'spk04' 'spk05' 'spk06' 'spk07' 'spk08' 'spk09'
 'spk10' 'spk11' 'spk12' 'spk13' 'spk14' 'spk15' 'spk16' 'spk17' 'spk18'
 'spk19' 'spk20' 'spk21' 'spk22' 'spk23' 'spk24']
DEBUG experiment: 24 speakers in test and 24 speakers in train
DEBUG nkululeko: train shape : (1152, 5), test shape:(288, 5)
DEBUG featureset: value for device not found, using default: cuda
DEBUG featureset: loading Spkrec model...
DEBUG featureset: value for Spkrec.model not found, using default: speechbrain/spkrec-ecapa-voxceleb
intialized SB model on cuda
DEBUG featureset: extracting Spkrec embeddings, this might take a while...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1152/1152 [02:06<00:00,  9.12it/s]
df shape: (1152, 192)
DEBUG feature_extractor: spkrec-ecapa-voxceleb: shape : (1152, 192)
DEBUG featureset: value for device not found, using default: cuda
DEBUG featureset: loading Spkrec model...
DEBUG featureset: value for Spkrec.model not found, using default: speechbrain/spkrec-ecapa-voxceleb
intialized SB model on cuda
DEBUG featureset: extracting Spkrec embeddings, this might take a while...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 288/288 [00:31<00:00,  9.10it/s]
df shape: (288, 192)
DEBUG feature_extractor: spkrec-ecapa-voxceleb: shape : (288, 192)
DEBUG experiment: All features: train shape : (1152, 192), test shape:(288, 192)
DEBUG scaler: scaling features based on training set
DEBUG runmanager: run 0
DEBUG model: value for C_val not found, using default: 0.001
DEBUG modelrunner: run: 0 epoch: 0: result: test: 0.997 UAR
DEBUG modelrunner: plotting confusion matrix to train_test_svm_spkrec-ecapa-voxceleb__0_000_cnf
DEBUG reporter: epoch: 0, UAR: 0.9965277777777777, ACC: 0.9965277777777778
DEBUG runmanager: value for measure not found, using default: uar
DEBUG reporter: labels: ['spk01', 'spk02', 'spk03', 'spk04', 'spk05', 'spk06', 'spk07', 'spk08', 'spk09', 'spk10', 'spk11', 'spk12', 'spk13', 'spk14', 'spk15', 'spk16', 'spk17', 'spk18', 'spk19', 'spk20', 'spk21', 'spk22', 'spk23', 'spk24']
DEBUG reporter: result per class (F1 score): [1.0, 1.0, 1.0, 1.0, 1.0, 0.96, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.957, 1.0, 1.0, 1.0, 1.0]
DEBUG experiment: Done, used 1583.496 seconds
DONE
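
The reported UAR, ACC, and per-class F1 values are consistent with a single misclassified test sample. The sketch below reproduces them under two assumptions that the log does not confirm: the test set holds 12 samples per speaker (288 / 24), and one spk20 sample was predicted as spk06. The helper name is hypothetical.

```python
from collections import Counter

labels = [f"spk{i:02d}" for i in range(1, 25)]

# Assumed test set: 12 samples per speaker, all predicted correctly at first.
y_true = [lab for lab in labels for _ in range(12)]
y_pred = list(y_true)

# Assumed single error: one spk20 sample is predicted as spk06.
y_pred[y_true.index("spk20")] = "spk06"


def per_class_scores(y_true, y_pred, labels):
    """Return per-class recall and F1 computed from label lists."""
    tp = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n_true = Counter(y_true)
    n_pred = Counter(y_pred)
    recall = {c: tp[c] / n_true[c] for c in labels}
    # F1 = 2*TP / (2*TP + FP + FN) = 2*TP / (n_true + n_pred)
    f1 = {c: 2 * tp[c] / (n_true[c] + n_pred[c]) for c in labels}
    return recall, f1


recall, f1 = per_class_scores(y_true, y_pred, labels)

# UAR (unweighted average recall) is the mean of the per-class recalls.
uar = sum(recall.values()) / len(labels)
acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"UAR: {uar:.3f}, ACC: {acc:.3f}")
print("F1 spk06:", round(f1["spk06"], 3), "F1 spk20:", round(f1["spk20"], 3))
```

UAR and accuracy coincide here because every class has the same number of test samples; on an imbalanced test set the two would differ.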

I also added a gerparas example; however, its CCC scores are very low (0.05).