RaviSoji / plda

Probabilistic Linear Discriminant Analysis & classification, written in Python.
https://ravisoji.com
Apache License 2.0

enrollment of model #59

Closed zabir-nabil closed 2 years ago

zabir-nabil commented 2 years ago

PLDA is often used for biometric tasks, where we need to enroll a set of models and then generate scores against those enrollments. In your setup, how would enrollment be done?

RaviSoji commented 2 years ago

Sorry, I never received a notification about this issue. Can you say more about what you hope to achieve with the model?

Ravi B. Sojitra

zabir-nabil commented 2 years ago

I mainly work on speaker recognition. I have trained an x-vector (neural network embedding) system [audio -> MFCC features -> neural network -> 512-d embedding vector], and I want to train PLDA on top of these embedding vectors. Is your implementation suitable for such tasks?

RaviSoji commented 2 years ago

Yes, that should work if you have enough labeled data per category. In the past, I've used this model to do similar things for image recognition.

Good luck! Ravi B. Sojitra

oplatek commented 2 years ago

Hi @RaviSoji,

I think that @zabir-nabil meant enrollment as in the snippet below from SpeechBrain. Notice that PLDA training and the use of enrollment statistics are separate steps.

To be honest, I also do not see how to perform enrollment with multiple utterances.

The multi-utterance enrollment process is described in the article "PLDA-based Speaker Verification in Multi-Enrollment Scenario using Expected Vector Approach". Sorry, I did not find a simpler explanation of the enrollment process.

>>> from speechbrain.processing.PLDA_LDA import *
>>> import random, numpy
>>> dim, N = 10, 100
>>> n_spkrs = 10
>>> train_xv = numpy.random.rand(N, dim)
>>> md = ['md'+str(random.randrange(1,n_spkrs,1)) for i in range(N)]
>>> modelset = numpy.array(md, dtype="|O")
>>> sg = ['sg'+str(i) for i in range(N)]
>>> segset = numpy.array(sg, dtype="|O")
>>> s = numpy.array([None] * N)
>>> stat0 = numpy.array([[1.0]]* N)
>>> xvectors_stat = StatObject_SB(modelset=modelset, segset=segset, start=s, stop=s, stat0=stat0, stat1=train_xv)
>>> # Training PLDA model: M ~ (mean, F, Sigma)
>>> plda = PLDA(rank_f=5)
>>> plda.plda(xvectors_stat)
>>> print (plda.mean.shape)
(10,)
>>> print (plda.F.shape)
(10, 5)
>>> print (plda.Sigma.shape)
(10, 10)
>>> # Enrollment (20 utts), Test (30 utts)
>>> en_N = 20
>>> en_xv = numpy.random.rand(en_N, dim)
>>> en_sgs = ['en'+str(i) for i in range(en_N)]
>>> en_sets = numpy.array(en_sgs, dtype="|O")
>>> en_s = numpy.array([None] * en_N)
>>> en_stat0 = numpy.array([[1.0]]* en_N)
>>> en_stat = StatObject_SB(modelset=en_sets, segset=en_sets, start=en_s, stop=en_s, stat0=en_stat0, stat1=en_xv)
>>> te_N = 30
>>> te_xv = numpy.random.rand(te_N, dim)
>>> te_sgs = ['te'+str(i) for i in range(te_N)]
>>> te_sets = numpy.array(te_sgs, dtype="|O")
>>> te_s = numpy.array([None] * te_N)
>>> te_stat0 = numpy.array([[1.0]]* te_N)
>>> te_stat = StatObject_SB(modelset=te_sets, segset=te_sets, start=te_s, stop=te_s, stat0=te_stat0, stat1=te_xv)
>>> ndx = Ndx(models=en_sets, testsegs=te_sets)
>>> # PLDA Scoring
>>> scores_plda = fast_PLDA_scoring(en_stat, te_stat, ndx, plda.mean, plda.F, plda.Sigma)
>>> print (scores_plda.scoremat.shape)
(20, 30)
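
On the multi-utterance enrollment question raised above, here is a minimal NumPy sketch of one way it could work. It is not the API of this repository or of SpeechBrain, and it is not the Expected Vector approach from the article. It assumes a simplified two-covariance Gaussian model, x = y + e with y ~ N(mu, B) and e ~ N(0, W), where mu, B and W are taken as already estimated from labeled training data. Enrollment then means computing the posterior over the speaker's latent mean from all of that speaker's utterances, and scoring is a log-likelihood ratio. The function names (`enroll`, `llr_score`, `gaussian_logpdf`) and the toy parameters are hypothetical, chosen only for illustration.

```python
import numpy as np


def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate normal N(mean, cov) evaluated at x."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (x.size * np.log(2.0 * np.pi) + logdet + maha)


def enroll(utterances, mu, B, W):
    """Posterior over a speaker's latent mean given n enrollment vectors.

    Assumed model (an illustration, not a library API):
    x = y + e, with y ~ N(mu, B) and e ~ N(0, W).
    """
    n = utterances.shape[0]
    B_inv = np.linalg.inv(B)
    W_inv = np.linalg.inv(W)
    post_cov = np.linalg.inv(B_inv + n * W_inv)
    post_mean = post_cov @ (B_inv @ mu + W_inv @ utterances.sum(axis=0))
    return post_mean, post_cov


def llr_score(test_vec, model, mu, B, W):
    """Log-likelihood ratio: same speaker as the enrolled model vs. unknown speaker."""
    post_mean, post_cov = model
    same = gaussian_logpdf(test_vec, post_mean, post_cov + W)
    different = gaussian_logpdf(test_vec, mu, B + W)
    return same - different


# Toy parameters (assumed known here; in practice they come from PLDA training).
rng = np.random.default_rng(0)
dim = 5
mu = np.zeros(dim)
B = 4.0 * np.eye(dim)   # between-speaker covariance
W = 0.1 * np.eye(dim)   # within-speaker covariance
noise_std = np.sqrt(0.1)

# Two synthetic speakers with well-separated latent means.
speaker_a = np.full(dim, 2.0)
speaker_b = -speaker_a

# Enroll speaker A with 5 utterances, then score one target and one non-target trial.
enroll_utts = speaker_a + noise_std * rng.standard_normal((5, dim))
model_a = enroll(enroll_utts, mu, B, W)

target_score = llr_score(
    speaker_a + noise_std * rng.standard_normal(dim), model_a, mu, B, W
)
nontarget_score = llr_score(
    speaker_b + noise_std * rng.standard_normal(dim), model_a, mu, B, W
)
print("target LLR:", target_score)
print("non-target LLR:", nontarget_score)
```

Under this assumed model, adding enrollment utterances shrinks the posterior covariance, so a model enrolled on more utterances gives sharper scores than one enrolled on a single utterance. Averaging the enrollment embeddings and scoring the mean as a single vector is a common simpler alternative, but it ignores that change in uncertainty.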