modelscope / 3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
Apache License 2.0

Inference index info in identification from trained model #120

Closed Tortoise17 closed 3 weeks ago

Tortoise17 commented 3 weeks ago

I have tried

python speakerlab/bin/infer_sv.py --model_id $model_id --wavs input.wav

This exports a NumPy array file. How can I tell from the trained model's output which speaker the array corresponds to, i.e. a speaker index or speaker identity for identification?

If you could guide me.

yfchenlucky commented 3 weeks ago

https://github.com/modelscope/3D-Speaker/blob/main/speakerlab/bin/infer_sv.py#L290-L291 You can extract the corresponding embeddings according to the order of enroll and test sets in your wav_path.
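A minimal sketch of keeping embeddings tied to their source files, in case it helps: it assumes each wav produced one `.npy` file named after the wav and that all of them sit in a single folder. That layout, and the helper name `load_embeddings`, are assumptions for illustration and are not taken from infer_sv.py.

```python
from pathlib import Path

import numpy as np


def load_embeddings(emb_dir):
    """Load every .npy file in emb_dir and return (names, matrix).

    names[i] is the file stem that produced row i of matrix, so the
    row order always stays aligned with the file order.
    """
    paths = sorted(Path(emb_dir).glob("*.npy"))
    if not paths:
        raise FileNotFoundError(f"no .npy files found in {emb_dir}")
    names = [p.stem for p in paths]
    # Flatten each array so shapes like (1, D) and (D,) both become (D,).
    matrix = np.stack([np.load(p).ravel() for p in paths])
    return names, matrix
```

Sorting the paths keeps the order deterministic, so the same index always refers to the same file across runs.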

Tortoise17 commented 3 weeks ago

@yfchenlucky Great, thank you. If I need to match one wav against a folder that contains multiple files, is that possible? I would like to get the closest top 1-2 matches.

Tortoise17 commented 3 weeks ago

I guess a small change in the pipeline will give the index of the closest match. Thank you again. If there is still a problem, I will ask for help.

yfchenlucky commented 3 weeks ago

There are many ways. If you don't want to change the code, you can construct a list of pairs, one per line (wav1 wav2, then wav1 wav3, and so on), then test the pairs and calculate the scores. Or you can simply modify infer_sv.py to keep the enroll wav fixed and loop over the test wavs.
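For the second option, a rough sketch of the scoring step once the embeddings are in memory: it ranks a set of candidate embeddings by cosine similarity to one fixed enroll embedding and returns the top matches. The function name `top_k_matches`, the file names, and the 192-dimensional random vectors are made up for the demo; real embeddings would come from the model output.

```python
import numpy as np


def top_k_matches(query, gallery, names, k=2):
    """Return the k gallery entries most similar to query.

    query:   1-D embedding of the fixed enroll wav
    gallery: 2-D array, one row per candidate wav
    names:   label for each gallery row, in the same order
    """
    query = np.asarray(query, dtype=np.float64).ravel()
    gallery = np.asarray(gallery, dtype=np.float64).reshape(len(names), -1)
    # L2-normalise so the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    scores = g @ q
    order = np.argsort(-scores)[:k]  # highest score first
    return [(names[i], float(scores[i])) for i in order]


# Demo with synthetic data (assumed dimension 192, not real speaker embeddings).
rng = np.random.default_rng(0)
gallery = rng.normal(size=(4, 192))
names = ["spk_a.wav", "spk_b.wav", "spk_c.wav", "spk_d.wav"]
query = gallery[2] + 0.05 * rng.normal(size=192)  # noisy copy of spk_c

matches = top_k_matches(query, gallery, names, k=2)
for name, score in matches:
    print(f"{name}\t{score:.4f}")
```

The top score alone only says which candidate is closest. Deciding whether it is the same speaker still needs a threshold tuned on your own data.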

Tortoise17 commented 3 weeks ago

@yfchenlucky Great help. Really excellent work. Super top architecture models.