HarryVolek / PyTorch_Speaker_Verification

PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
BSD 3-Clause "New" or "Revised" License

N = 1 gives EER : 0.00 (thres:0.00, FAR:0.00, FRR:0.00) for enrollment and verification? #22

Closed: MuruganR96 closed this issue 5 years ago

MuruganR96 commented 5 years ago

When the number of speakers in the batch is N = 1, I get EER : 0.00 (thres:0.00, FAR:0.00, FRR:0.00):

torch.Size([1, 4, 160, 40]) mel_db_batch.size()
4 mel_db_batch.size(1)
0 batch_id
torch.Size([1, 2, 160, 40]) enrollment_batch size
torch.Size([1, 2, 160, 40]) verification_batch size
torch.Size([2, 160, 40]) reshape tensor enrollment_batch size
torch.Size([2, 160, 40]) reshape tensor verification_batch size
2 reshape tensor verification_batch size(0)
[0, 1] perm
[0, 1] unperm
torch.Size([2, 160, 40]) verification_batch
torch.Size([2, 256]) enrollment_embeddings
torch.Size([2, 256]) verification_embeddings
torch.Size([2, 256]) verification_embeddings
torch.Size([1, 2, 256]) enrollment_embeddings.size()
torch.Size([1, 2, 256]) verification_embeddings.size()
torch.Size([1, 256]) enrollment_centroids.size()
tensor([[[0.7973],
         [0.7973]]], grad_fn=<CopySlices>) sim_matrix
torch.Size([1, 2, 1]) sim_matrix.size()

EER : 0.00 (thres:0.00, FAR:0.00, FRR:0.00)

EER across 10 epochs: 0.0000

When N > 1, I get EER : 0.01 (thres:0.60, FAR:0.03, FRR:0.00):

EER : 0.00 (thres:0.71, FAR:0.00, FRR:0.00)
torch.Size([4, 6, 160, 40]) mel_db_batch.size()
6 mel_db_batch.size(1)
11 batch_id
torch.Size([4, 3, 160, 40]) enrollment_batch size
torch.Size([4, 3, 160, 40]) verification_batch size
torch.Size([12, 160, 40]) reshape tensor enrollment_batch size
torch.Size([12, 160, 40]) reshape tensor verification_batch size
12 reshape tensor verification_batch size(0)
[2, 6, 5, 4, 7, 8, 9, 1, 0, 10, 3, 11] perm
[8, 7, 0, 10, 3, 2, 1, 4, 5, 6, 9, 11] unperm
torch.Size([12, 160, 40]) verification_batch
torch.Size([12, 256]) enrollment_embeddings
torch.Size([12, 256]) verification_embeddings
torch.Size([12, 256]) verification_embeddings
torch.Size([4, 3, 256]) enrollment_embeddings.size()
torch.Size([4, 3, 256]) verification_embeddings.size()
torch.Size([4, 256]) enrollment_centroids.size()

torch.Size([4, 3, 4]) sim_matrix.size()

EER : 0.01 (thres:0.60, FAR:0.03, FRR:0.00)
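
To make the difference between the two runs concrete, here is a minimal sketch of how FAR and FRR can be counted from a similarity matrix of shape (N, M, N) like the ones logged above. It is plain Python and not the repository's code: the function name `far_frr` and the N = 2 scores are made up for illustration. With N = 1 there is no other speaker's centroid to compare against, so there are no impostor trials and FAR has nothing to count.

```python
def far_frr(sim_matrix, threshold):
    """Count false accepts / false rejects at one threshold.

    sim_matrix[j][i][k] is the similarity of utterance i of speaker j
    to the centroid of speaker k (shape N x M x N, as nested lists).
    Returns (FAR, FRR); a rate is None when it has no trials to count.
    """
    false_accepts = false_rejects = 0
    impostor_trials = genuine_trials = 0
    for j, speaker in enumerate(sim_matrix):
        for utterance in speaker:
            for k, score in enumerate(utterance):
                accepted = score > threshold
                if j == k:
                    # Genuine trial: utterance scored against its own speaker.
                    genuine_trials += 1
                    false_rejects += int(not accepted)
                else:
                    # Impostor trial: utterance scored against another speaker.
                    impostor_trials += 1
                    false_accepts += int(accepted)
    far = false_accepts / impostor_trials if impostor_trials else None
    frr = false_rejects / genuine_trials if genuine_trials else None
    return far, frr


# N = 1, M = 2: the sim_matrix values from the first log, shape (1, 2, 1).
single = [[[0.7973], [0.7973]]]
print(far_frr(single, 0.6))  # (None, 0.0): no impostor trials exist

# N = 2, M = 2 with made-up scores, shape (2, 2, 2).
pair = [[[0.9, 0.2], [0.8, 0.7]],
        [[0.1, 0.85], [0.3, 0.5]]]
print(far_frr(pair, 0.6))  # (0.25, 0.25)
```

Under these assumptions, an equal-error threshold cannot be fitted from a single speaker's scores alone, because the sweep needs scores against at least one other speaker's centroid to measure false accepts.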

How can I find the best threshold for single-speaker enrollment versus inference at test time?

@HarryVolek Thanks sir.

FengLeee commented 5 years ago

Take a look at line 42 of utils.py. When speaker_num == centroid_num, the code recalculates the centroid using the verification embeddings, so if you have just one verification embedding, the centroid is that embedding itself.
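
A rough sketch of the kind of recalculation described here, assuming the centroid for a speaker's own utterance is rebuilt from that speaker's other verification embeddings (a leave-one-out mean). This is an assumption for illustration, not a copy of utils.py: the helper names `cosine`, `mean` and `leave_one_out_centroid` are hypothetical and the 2-dimensional embeddings are made up.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def leave_one_out_centroid(utterances, skip):
    """Centroid of one speaker's embeddings, excluding utterance `skip`."""
    others = [u for i, u in enumerate(utterances) if i != skip]
    return mean(others)


# One speaker (N = 1) with M = 2 made-up verification embeddings.
utterances = [[1.0, 0.0], [0.6, 0.8]]

scores = [
    cosine(u, leave_one_out_centroid(utterances, i))
    for i, u in enumerate(utterances)
]
print([round(s, 4) for s in scores])  # [0.6, 0.6]
```

With M = 2, each utterance's leave-one-out centroid is simply the other utterance, so both scores are the same cosine similarity. If the repository behaves this way, that would be consistent with the two identical 0.7973 entries in the N = 1 sim_matrix above.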