felixbur / nkululeko

Machine learning speaker characteristics
MIT License

add wavlm #63

Closed by bagustris 10 months ago

bagustris commented 10 months ago

Variants to be used for FEATS.type:

Example INI file (ravdess):

[EXP]
root = ./
name = results/exp_ravdess_hubert
runs = 1
epochs = 1
save = True
[DATA]
databases = ['train', 'test', 'dev']
train = ./data/ravdess/ravdess_train.csv
train.type = csv
train.absolute_path = False
train.split_strategy = train
dev = ./data/ravdess/ravdess_dev.csv
dev.type = csv
dev.absolute_path = False
dev.split_strategy = train
test = ./data/ravdess/ravdess_test.csv
test.type = csv
test.absolute_path = False
test.split_strategy = test
target = emotion
labels = ['angry', 'happy', 'neutral', 'sad']
[FEATS]
type = ['wavlm-large']
no_reuse = False
scale = standard
[MODEL]
type = svm
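
For readers who want to inspect such a config outside the toolkit, here is a minimal sketch using only the Python standard library. It assumes that list-valued options such as `type = ['wavlm-large']` are written as Python literals, as they appear above; it is not necessarily how nkululeko reads the file internally. The `INI_TEXT` string is an abbreviated copy of the example, and the variable names are illustrative only.

```python
import ast
import configparser

# Abbreviated copy of the [FEATS] and [MODEL] sections from the INI above.
INI_TEXT = """
[FEATS]
type = ['wavlm-large']
no_reuse = False
scale = standard
[MODEL]
type = svm
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)

# List-valued options are written as Python literals, so literal_eval
# turns the string "['wavlm-large']" into a real list.
feat_types = ast.literal_eval(config["FEATS"]["type"])

# getboolean understands "True"/"False" (case-insensitive).
no_reuse = config["FEATS"].getboolean("no_reuse")
model_type = config["MODEL"]["type"]

print(feat_types, no_reuse, model_type)
```

Since `FEATS.type` is a list, adding another variant is a matter of extending that list in the INI file.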

Results

(.env) bagus@pc-omen:nkululeko$ python3 -m nkululeko.nkululeko --config data/ravdess/exp_ravdess_w2v2_svm.ini 
DEBUG nkululeko: running results/exp_ravdess_hubert from config data/ravdess/exp_ravdess_w2v2_svm.ini, nkululeko version 0.62.1
DEBUG dataset: loading train
DEBUG dataset: num of speakers: 16
DEBUG dataset: Loaded database train with 960 samples: got targets: True, got speakers: True, got sexes: True
DEBUG dataset: train: loaded data with 960 samples: got targets: True, got speakers: True, got sexes: True
DEBUG dataset: loading test
DEBUG dataset: num of speakers: 4
DEBUG dataset: Loaded database test with 240 samples: got targets: True, got speakers: True, got sexes: True
DEBUG dataset: test: loaded data with 240 samples: got targets: True, got speakers: True, got sexes: True
DEBUG dataset: loading dev
DEBUG dataset: num of speakers: 4
DEBUG dataset: Loaded database dev with 240 samples: got targets: True, got speakers: True, got sexes: True
DEBUG dataset: dev: loaded data with 240 samples: got targets: True, got speakers: True, got sexes: True
DEBUG experiment: loaded databases train,test,dev
DEBUG experiment: reusing previously stored ./results/exp_ravdess_hubert/./store/testdf.csv and ./results/exp_ravdess_hubert/./store/traindf.csv
DEBUG experiment: value for filter.sample_selection not found, using default: all
DEBUG experiment: value for type not found, using default: dummy
DEBUG experiment: Categories test: ['sad' 'neutral' 'happy' 'angry']
DEBUG experiment: Categories train: ['angry' 'happy' 'sad' 'neutral']
DEBUG experiment: 4 speakers in test and 20 speakers in train
DEBUG nkululeko: train shape : (560, 5), test shape:(112, 5)
DEBUG featureset: value for device not found, using default: cuda
DEBUG featureset: reusing extracted wavlm-large embeddings
DEBUG feature_extractor: wavlm-large: shape : (560, 1024)
DEBUG featureset: value for device not found, using default: cuda
DEBUG featureset: reusing extracted wavlm-large embeddings
DEBUG feature_extractor: wavlm-large: shape : (112, 1024)
DEBUG experiment: All features: train shape : (560, 1024), test shape:(112, 1024)
DEBUG scaler: scaling features based on training set
DEBUG runmanager: run 0
DEBUG model: value for C_val not found, using default: 0.001
DEBUG modelrunner: run: 0 epoch: 0: result: test: 0.922 UAR
DEBUG modelrunner: plotting confusion matrix to train_test_dev_svm_wavlm-large__0_000_cnf
DEBUG runmanager: value for measure not found, using default: uar
DEBUG reporter: labels: ['angry' 'happy' 'neutral' 'sad']
DEBUG reporter: result per class (F1 score): [0.954, 0.881, 0.938, 0.912]
DEBUG experiment: Done, used 2.831 seconds
DONE
(.env) bagus@pc-omen:nkululeko$ python3 -m nkululeko.nkululeko --config data/ravdess/exp_ravdess_w2v2_svm.ini 
DEBUG nkululeko: running results/exp_ravdess_hubert from config data/ravdess/exp_ravdess_w2v2_svm.ini, nkululeko version 0.62.1
DEBUG dataset: loading train
DEBUG dataset: num of speakers: 16
DEBUG dataset: Loaded database train with 960 samples: got targets: True, got speakers: True, got sexes: True
DEBUG dataset: train: loaded data with 960 samples: got targets: True, got speakers: True, got sexes: True
DEBUG dataset: loading test
DEBUG dataset: num of speakers: 4
DEBUG dataset: Loaded database test with 240 samples: got targets: True, got speakers: True, got sexes: True
DEBUG dataset: test: loaded data with 240 samples: got targets: True, got speakers: True, got sexes: True
DEBUG dataset: loading dev
DEBUG dataset: num of speakers: 4
DEBUG dataset: Loaded database dev with 240 samples: got targets: True, got speakers: True, got sexes: True
DEBUG dataset: dev: loaded data with 240 samples: got targets: True, got speakers: True, got sexes: True
DEBUG experiment: loaded databases train,test,dev
DEBUG experiment: reusing previously stored ./results/exp_ravdess_hubert/./store/testdf.csv and ./results/exp_ravdess_hubert/./store/traindf.csv
DEBUG experiment: value for filter.sample_selection not found, using default: all
DEBUG experiment: value for type not found, using default: dummy
DEBUG experiment: Categories test: ['sad' 'neutral' 'happy' 'angry']
DEBUG experiment: Categories train: ['angry' 'happy' 'sad' 'neutral']
DEBUG experiment: 4 speakers in test and 20 speakers in train
DEBUG nkululeko: train shape : (560, 5), test shape:(112, 5)
DEBUG featureset: value for device not found, using default: cuda
DEBUG featureset: reusing extracted wavlm-large embeddings
DEBUG feature_extractor: wavlm-large: shape : (560, 1024)
DEBUG featureset: value for device not found, using default: cuda
DEBUG featureset: reusing extracted wavlm-large embeddings
DEBUG feature_extractor: wavlm-large: shape : (112, 1024)
DEBUG experiment: All features: train shape : (560, 1024), test shape:(112, 1024)
DEBUG scaler: scaling features based on training set
DEBUG runmanager: run 0
DEBUG model: value for C_val not found, using default: 0.001
DEBUG modelrunner: run: 0 epoch: 0: result: test: 0.922 UAR
DEBUG modelrunner: plotting confusion matrix to train_test_dev_svm_wavlm-large__0_000_cnf
DEBUG runmanager: value for measure not found, using default: uar
DEBUG reporter: labels: ['angry' 'happy' 'neutral' 'sad']
DEBUG reporter: result per class (F1 score): [0.954, 0.881, 0.938, 0.912]
DEBUG experiment: Done, used 2.870 seconds
DONE
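
The runs above report UAR (unweighted average recall), which is the mean of the per-class recalls, so each of the four emotion classes counts equally regardless of how many samples it has. A minimal sketch of that computation follows; the function name and the toy labels are made up for illustration and are not taken from the experiment or from nkululeko's code.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall; every class counts equally regardless of size."""
    hits = defaultdict(int)    # correctly predicted samples per true class
    totals = defaultdict(int)  # all samples per true class
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    recalls = [hits[label] / totals[label] for label in totals]
    return sum(recalls) / len(recalls)


# Toy labels, invented for illustration (not taken from the run above).
y_true = ["angry", "angry", "happy", "happy", "neutral", "sad"]
y_pred = ["angry", "happy", "happy", "happy", "neutral", "sad"]

uar = unweighted_average_recall(y_true, y_pred)
print(round(uar, 3))
```

In this toy case one of the two "angry" samples is misclassified, so its recall is 0.5 while the other three classes have recall 1.0.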