mravanelli / SincNet

SincNet is a neural architecture for efficiently processing raw audio samples.
MIT License

What is "compute_d_vectors.py" really about? #92

Open briverse17 opened 3 years ago

briverse17 commented 3 years ago

Hi Mr. @mravanelli and everyone interested in this repo,

From what I have learned so far, a speaker's d-vector is a feature vector extracted for that speaker by a deep learning model. We can then use d-vectors for many tasks: identification, verification, diarization, etc. However, at first sight, SincNet is not about computing d-vectors. For example, the speaker_id experiment is an end-to-end method for the identification problem: all the speakers' features are combined in one model, and no step of the recipe involves computing d-vectors.

Then what is the purpose of "compute_d_vectors.py"? As Mr. Ravanelli explained in the comments, the script computes d-vectors from the test files using the pretrained speaker_id model, and each file's name and d-vector are saved into a dictionary. What are those d-vectors used for, given that we did not compute d-vectors for each speaker when training the speaker_id model? We seem to have nothing to compare the test files' d-vectors with.

Sorry if this is a silly question. And thanks in advance for any answer.

Regards,

Vu Nguyen.
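
For readers following the question above, here is a minimal sketch of the idea of "one d-vector per file, stored in a dictionary keyed by file name". It is not the code of "compute_d_vectors.py": the helper names (`mean_pool`, `l2_normalize`), the file names, and the toy 2-dimensional activations are all hypothetical, and plain Python lists stand in for the arrays a real model would produce. It assumes a d-vector is obtained by averaging the per-chunk hidden-layer outputs of a trained speaker-ID network.

```python
import math


def mean_pool(chunk_embeddings):
    """Average per-chunk embeddings into one utterance-level vector."""
    n = len(chunk_embeddings)
    dim = len(chunk_embeddings[0])
    return [sum(e[i] for e in chunk_embeddings) / n for i in range(dim)]


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


# Hypothetical per-chunk activations; in practice these would come from a
# hidden layer of the pretrained speaker_id model (assumption).
chunks_by_file = {
    "spk1_utt1.wav": [[1.0, 0.0], [3.0, 4.0]],
    "spk2_utt1.wav": [[0.0, 2.0], [0.0, 4.0]],
}

# One d-vector per file, keyed by file name.
d_vect_dict = {
    name: l2_normalize(mean_pool(chunks))
    for name, chunks in chunks_by_file.items()
}
```

Under this assumption the training speakers never need d-vectors of their own: any file, including one from a speaker unseen in training, can be mapped to a vector and compared with other files afterwards.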

dheerajbiswas commented 1 year ago

Hi Mr. M. Ravanelli and all,

I have gone through your SincNet paper and the available code. The paper reports speaker verification results, but the code for them is not provided. I found that d-vectors are computed in "compute_d_vectors.py", but the second DNN module is not used to produce verification results. Mapping N speaker classes to a 2-class decision is a nontrivial step in neural networks. Could you please provide the speaker verification code?
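
Until official verification code is available, a common way to use d-vectors for verification is to average a speaker's enrollment d-vectors into a speaker model and score a test d-vector against it with cosine similarity. The sketch below illustrates only that generic scheme. It is not the author's verification recipe and does not reproduce the paper's results; the function names, the toy vectors, and the 0.5 threshold are assumptions chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def enroll(d_vectors):
    """Average several enrollment d-vectors into one speaker model."""
    n = len(d_vectors)
    dim = len(d_vectors[0])
    return [sum(v[i] for v in d_vectors) / n for i in range(dim)]


def verify(speaker_model, test_d_vector, threshold=0.5):
    """Return (score, accepted). The threshold is an arbitrary placeholder;
    in practice it would be tuned on held-out trials."""
    score = cosine_similarity(speaker_model, test_d_vector)
    return score, score >= threshold


# Toy 2-dimensional d-vectors (hypothetical values).
model = enroll([[1.0, 0.0], [0.8, 0.2]])
target_score, target_ok = verify(model, [1.0, 0.1])      # same-speaker trial
impostor_score, impostor_ok = verify(model, [0.0, 1.0])  # impostor trial
```

This turns the N-class identification network into a 2-class accept/reject decision without retraining: the network only supplies the embeddings, and the same/different decision comes from the similarity threshold.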