Closed tonytonyissaissa closed 5 years ago
Hi @tonytonyissaissa .
The inference can be used for speaker identification. To do so, take a waveform of the speaker you want to identify, run it through the model, and store the embedding ("enrollment"). Compare future waveforms to your collection of embeddings. Take a look at the test function in train_speech_embedder.py for an example of how to compute similarity between embeddings.
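A minimal sketch of that enrollment-and-compare flow, using NumPy and random vectors in place of real model output (the function names `cosine_similarity` and `identify`, the embedding size, and the speaker names are all illustrative assumptions, not part of the project's API):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify(query_embedding, enrollment):
    # enrollment: dict mapping speaker name -> stored ("enrolled") embedding.
    # Returns the best-matching speaker and its similarity score.
    scores = {name: cosine_similarity(query_embedding, emb)
              for name, emb in enrollment.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Toy example: random vectors stand in for embeddings produced by the model.
rng = np.random.default_rng(0)
alice = rng.normal(size=256)
bob = rng.normal(size=256)
enrollment = {"alice": alice, "bob": bob}

# A query embedding close to alice's enrolled embedding should match "alice".
query = alice + 0.01 * rng.normal(size=256)
speaker, score = identify(query, enrollment)
```

In practice the embeddings would come from running waveforms through the trained embedder, as described above; only the comparison logic is shown here.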
Thank you @HarryVolek. What cosine similarity threshold should I use for inference? (That is, if cossim(embedding_wav1, embedding_wav2) > threshold, then wav1 and wav2 belong to the same speaker.) Is it EER_thresh from the testing phase?

Pick the threshold that performs best. EER_thresh should be a good indicator of this threshold.
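Applying a chosen threshold to a verification decision can be sketched as follows, again with random stand-in embeddings (the threshold value 0.7, the helper `same_speaker`, and the embedding size are illustrative assumptions; in practice you would plug in EER_thresh or a tuned value):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_speaker(emb1, emb2, threshold):
    # Accept the pair as the same speaker when similarity exceeds the
    # chosen threshold (e.g. EER_thresh found during the test phase).
    return cosine_similarity(emb1, emb2) > threshold

rng = np.random.default_rng(1)
e1 = rng.normal(size=256)
e2 = e1 + 0.05 * rng.normal(size=256)  # slight perturbation: "same speaker"
e3 = rng.normal(size=256)              # independent vector: "different speaker"

accept_same = same_speaker(e1, e2, threshold=0.7)
accept_diff = same_speaker(e1, e3, threshold=0.7)
```

High-dimensional random vectors are nearly orthogonal, so the "different speaker" pair scores near zero while the perturbed pair scores near one; real embeddings from a trained model separate less cleanly, which is why the threshold must be tuned.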
@tonytonyissaissa have you used this project successfully?
Hi @HarryVolek,
I trained the model correctly, but now I have some .wav files as input. How can I use the trained model to do inference? Also, can inference be used for speaker identification (verifying whether an utterance belongs to one of a set of N speakers), or is it only valid for speaker verification (verifying that an utterance is from the claimed speaker)? Thanks in advance, Tony