how to use this project?

dominoanty / SpeakerRecognition

Implementing speaker recognition using Python (GMM-UBM)


how to use this project? #1

Open akshat9425 opened 5 years ago

akshat9425 commented 5 years ago

Hi,

I am trying to implement speaker verification. Could you please tell me which paths need to be set in your project and which folders are needed to build it?

I was confused about why you wrote self.spk.append(wavfile.read('data/train.wav')) in the main_old file.

Please tell me, step by step, how I can run your repo.

Please reply. Thanks.

akshat9425 commented 5 years ago

I used it and it worked perfectly. Could you help me with how to integrate it into login and registration by voice?

dominoanty commented 5 years ago

Hey @akshat9425. Sorry for the late reply. I'm glad you were able to use it! I must add some documentation I suppose. I didn't expect anyone to actually reach here!

About the login part: one approach could be to pickle the GMM (pickling lets you save trained models) to a location and store that path along with the user details in your user table. I'm not sure about the UBM, though. What I did was train the UBM on all the data that I had. For this approach you'd need some sort of UBM as well, which I also recommend you pickle and store as a file.
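A minimal sketch of the pickling idea above, using only the standard library. The helper names `save_model` and `load_model` and the file layout are hypothetical and not taken from the repo; any picklable object (a trained GMM or the UBM) could be passed in.

```python
import pickle
from pathlib import Path


def save_model(model, path):
    """Serialize a trained model (e.g. a speaker GMM or the UBM) to disk.

    Returns the path as a string, which could be stored next to the
    user's details in the user table. (Hypothetical helper.)
    """
    path = Path(path)
    # Create the target directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(model, f)
    return str(path)


def load_model(path):
    """Load a model saved with save_model.

    Only unpickle files your own application wrote: pickle can execute
    arbitrary code when loading untrusted data.
    """
    with Path(path).open("rb") as f:
        return pickle.load(f)
```

At registration you would call `save_model` once per user and keep the returned path; at login you would call `load_model` on that path and on the UBM's path before scoring.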

My approach, if I remember correctly, was to build a confusion matrix and then pick the column with the highest match. You can probably just extract MFCCs from the test voice, run them through all the GMM-UBMs, and select the one with the highest match.
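The "run it through all the models and pick the best" step could look roughly like the sketch below. It assumes each model exposes a `score(features)` method returning an average log-likelihood, as scikit-learn's `GaussianMixture.score` does; whether the repo's models have that interface is an assumption, and `identify_speaker` is a hypothetical name. MFCC extraction is left out.

```python
def identify_speaker(features, speaker_models, ubm):
    """Score test features against every enrolled speaker model.

    features       -- MFCC frames of the test recording
    speaker_models -- dict mapping speaker name -> model with .score()
    ubm            -- background model with .score()

    Returns (best_speaker, scores), where each score is the speaker
    model's average log-likelihood minus the UBM's (a log-likelihood
    ratio). Assumes speaker_models is not empty.
    """
    ubm_score = ubm.score(features)
    scores = {
        name: model.score(features) - ubm_score
        for name, model in speaker_models.items()
    }
    # The best guess is the speaker with the highest ratio.
    best_speaker = max(scores, key=scores.get)
    return best_speaker, scores
```

Note that this always returns some enrolled speaker, even for a voice that was never enrolled, because it only ranks the candidates.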

Let me know if you need any more information.

akshat9425 commented 5 years ago

@dominoanty I made some changes to the code, i.e. I tested a voice that is not present in the dataset. I want the result to say that it is not present, but it actually gives this output:

For speaker /home/user/new_virtual_env/demo_check/data/english1.wav, best guess is english5.wav

Here is the code I edited:

print("For speaker {}, best guess is {}".format("/home/user/new_virtual_env/demo_check/data/english1.wav", SPEAKERS[best_guess]))

Please tell me what I am missing.

dominoanty commented 5 years ago

You can probably do some kind of thresholding, such that if the match is not "enough" of a match, you identify the speaker as external.
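A rough sketch of such thresholding, assuming you already have one match score per enrolled speaker (higher meaning a better match). The function name `accept_or_reject` is hypothetical, and the thread does not give a threshold value; it would have to be tuned on held-out recordings of known and unknown speakers.

```python
def accept_or_reject(scores, threshold):
    """Return the best-matching speaker, or None for an external speaker.

    scores    -- dict mapping speaker name -> match score (higher is better)
    threshold -- minimum score for the best match to count as "enough";
                 an assumed tuning parameter, not a value from the repo
    """
    if not scores:
        return None
    best_speaker = max(scores, key=scores.get)
    # Only the single best score is compared with the threshold.
    if scores[best_speaker] >= threshold:
        return best_speaker
    return None
```

Because only the top score is checked, the same comparison applies no matter how many speakers are enrolled.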

akshat9425 commented 5 years ago

If I follow your suggestion and we have a lot of voices, then it is not actually possible to implement a threshold. Please check it from your side.