tyiannak / pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
Apache License 2.0

speaker_diarization - clarification #348

Open agmicha opened 3 years ago

agmicha commented 3 years ago

I analyzed speaker_diarization and noticed something puzzling. Why are the predictions from knn_speaker_10 and knn_speaker_male_female not used in the clustering?

K-means uses only a subset of the features, [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53], so in effect it uses only mfcc_1_mean through mfcc_13_mean, delta spectral_rolloff_mean, and delta mfcc_1_mean through delta mfcc_12_mean. No predictions from the models above are included. Is this selection correct? If it is, why are predictions made with the KNN models beforehand?
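For reference, the index-to-name mapping described above can be reproduced with a small standalone sketch. It does not call pyAudioAnalysis. It assumes the layout the question relies on: 34 short-term features followed by their 34 deltas, with the mid-term vector holding all the means first and then all the standard deviations. The variable names (`base`, `short_term`, `mid_term`, `selected`) are illustrative only.

```python
# Assumed short-term feature order (34 features); not read from the library.
base = (
    ["zcr", "energy", "energy_entropy", "spectral_centroid",
     "spectral_spread", "spectral_entropy", "spectral_flux",
     "spectral_rolloff"]
    + [f"mfcc_{i}" for i in range(1, 14)]      # mfcc_1 .. mfcc_13
    + [f"chroma_{i}" for i in range(1, 13)]    # chroma_1 .. chroma_12
    + ["chroma_std"]
)

# Assumption: delta features are appended after the 34 base features.
short_term = base + [f"delta {name}" for name in base]

# Assumption: mid-term vector = all means, then all standard deviations.
mid_term = ([f"{name}_mean" for name in short_term]
            + [f"{name}_std" for name in short_term])

# Indices quoted in the question: 8..20 and 41..53.
selected = list(range(8, 21)) + list(range(41, 54))
selected_names = [mid_term[i] for i in selected]

for idx, name in zip(selected, selected_names):
    print(idx, name)
```

Under these assumptions the 26 selected columns are all `_mean` statistics, and none of them corresponds to a KNN model output.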

Xiaoping777 commented 2 years ago

I am following the clustering / K-means method to segment male / female speakers, and the performance is pretty bad. I am not sure whether this is a similar issue.