Walleclipse / Deep_Speaker-speaker_recognition_system

Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)

Confused with the similarity results #76

Closed: fabianoluzbr closed this issue 2 years ago

fabianoluzbr commented 3 years ago

Hello guys,

I am using this code, but I have noticed that the results differ each time I run the test; they are highly non-deterministic. For example, for the same pair the similarity is sometimes 0.7 and other times 0.10. Am I doing something wrong?

PS: I am using the model with 74 speakers and 30 seconds of speech for each one.

best

Walleclipse commented 3 years ago

Hi, I think the main reason is the random clip in def clipped_audio in the code. If the audio is too long, it is randomly clipped before being fed into the model, so each run embeds a different segment of the same recording.
You can modify def clipped_audio to use a deterministic clip (always take the middle frames of the long audio, or clip several different parts of the long audio and then average the results).
Or fix the random seed when running the code, according to reproducible-results-neural-networks-keras.
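
A minimal sketch of the two deterministic options, not the repository's actual code. It assumes the features are a NumPy array with frames along axis 0. The names center_clip, evenly_spaced_clips, averaged_embedding and embed_fn are hypothetical, and NUM_FRAMES = 160 is only a placeholder for whatever window length your config uses. Averaging the embeddings (rather than the similarity scores) is one reading of "average the results".

```python
import numpy as np

NUM_FRAMES = 160  # assumed window length; replace with the value from your config


def center_clip(x, num_frames=NUM_FRAMES):
    """Return the middle num_frames frames of x (frames along axis 0)."""
    if x.shape[0] <= num_frames:
        return x  # short audio is passed through unchanged
    start = (x.shape[0] - num_frames) // 2
    return x[start:start + num_frames]


def evenly_spaced_clips(x, num_frames=NUM_FRAMES, num_clips=5):
    """Return num_clips fixed windows spread evenly over x."""
    if x.shape[0] <= num_frames:
        return [x]
    starts = np.linspace(0, x.shape[0] - num_frames, num_clips).astype(int)
    return [x[s:s + num_frames] for s in starts]


def averaged_embedding(x, embed_fn, num_frames=NUM_FRAMES, num_clips=5):
    """Embed each fixed window with embed_fn, average, and L2-normalise.

    embed_fn is a hypothetical callable mapping one clip to a 1-D embedding,
    e.g. a thin wrapper around model.predict.
    """
    clips = evenly_spaced_clips(x, num_frames, num_clips)
    mean = np.mean([embed_fn(c) for c in clips], axis=0)
    return mean / (np.linalg.norm(mean) + 1e-12)
```

Because neither function draws random numbers, the same file always yields the same clip and therefore the same similarity. If you would rather keep the random clip and it draws from NumPy's global generator, calling np.random.seed(0) once before the test should make the clip positions repeatable, although that alone does not guarantee deterministic results from the network itself.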