MycroftAI / mycroft-precise

A lightweight, simple-to-use, RNN wake word listener
Apache License 2.0

Mycroft Precise: is it only a wake word processor, or can it also achieve voice authentication? #46

Closed MuruganR96 closed 5 years ago

MuruganR96 commented 5 years ago

I am working with Mycroft Precise. I tested (precise-listen) with my generated model, and it predicted the wake word correctly for my voice and for my friends' voices as well as ( xxxxxx---------------------------------- ). But I need the wake word to work for voice authentication purposes, and I have only basic knowledge of voice authentication. How can I get the audio features for a particular person and verify them at the same time? Sir, please help me and give your suggestions and ideas. Thank you in advance.

MatthewScholefield commented 5 years ago

Precise is only meant as a wake word listener, not a voice authentication tool. However, if you would like to experiment with voice authentication, you can try a few things:

- Putting samples of your voice in the wake-word folder and other people's samples in the not-wake-word folder (a folder-layout sketch follows this comment).
- Experimenting with the model architecture and audio features, for example adding another recurrent layer or changing the MFCC parameters.

Let me know if this helps, and be sure to keep us updated on any progress you make.
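
For the first option, the data folder that precise-train reads looks roughly like this (a sketch of the layout described in the README; "hey-computer" is just an example name, so adjust it to your setup):

# Sketch of the training data layout precise-train expects.
from pathlib import Path

DATA_DIR = Path('hey-computer')
SUBDIRS = [
    'wake-word',            # recordings of *your* voice saying the wake word
    'not-wake-word',        # other voices, other words, background noise
    'test/wake-word',       # held-out positives for precise-test
    'test/not-wake-word',   # held-out negatives for precise-test
]

for sub in SUBDIRS:
    (DATA_DIR / sub).mkdir(parents=True, exist_ok=True)

# Quick sanity check on how many samples sit in each bucket
for sub in SUBDIRS:
    count = len(list((DATA_DIR / sub).glob('*.wav')))
    print(f'{sub}: {count} wav files')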

MuruganR96 commented 5 years ago

Thank you, sir.

Putting samples of your voice in the wake-word folder and other people's samples in the not-wake-word folder: saving my own audio to the wake-word folder, extracting features, and training on it is possible in real time. But getting other people's samples for the not-wake-word folder dynamically is not feasible in real time.

I tried your suggestion from this issue, sir: https://github.com/MycroftAI/mycroft-precise/issues/37#issuecomment-431266672

Try using the speech commands dataset (latest download here). It's an archive with a series of folders with a bunch of samples of different words. You can just drop that in the not-wake-word folder. Using the speech commands dataset as non-wake-word data, I am getting these results:

=== False Negatives ===

=== Counts ===
False Positives: 6905
True Negatives: 83
False Negatives: 0
True Positives: 304

=== Summary ===
387 out of 7292
5.31 %

98.81 % false positives
0.0 % false negatives
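
Those summary figures follow directly from the counts above; as a quick check:

fp, tn, fn, tp = 6905, 83, 0, 304

correct = tp + tn                     # 387 correctly classified clips
total = fp + tn + fn + tp             # 7292 test clips overall
print(f'{correct} out of {total} = {100 * correct / total:.2f} %')   # ~5.31 %

print(f'false positives: {100 * fp / (fp + tn):.2f} %')              # ~98.81 %
print(f'false negatives: {100 * fn / (fn + tp):.1f} %')              # 0.0 %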

Sir, I have one doubt: if we add more general, common non-wake-word audio such as the speech commands dataset or the Public Domain Sounds Backup, does that satisfy the "other people's samples" property of the not-wake-word folder?

Meanwhile, I will research your second suggestion, sir, and update you on the status.

Thank you very much for your quick response, sir.

MatthewScholefield commented 5 years ago

I would suggest recording some examples of your wake word specifically spoken by other people to put in the not-wake-word folder. The reason is that it's easy to distinguish between you saying your own wake word and someone else saying a totally different word. It's much harder to distinguish two different people saying the same word. This is why it would probably help to have even just a few samples of different voices saying your wake word in the not-wake-word folder.

penrods commented 5 years ago

You are looking for more than simple "wake word" and instead asking for "speaker identification". You might check out the work Google just released: https://venturebeat.com/2018/11/12/google-open-sources-ai-that-can-distinguish-between-voices-with-92-percent-accuracy/

If you do, I'm curious as to the results you see. I am guessing the approach Google released might be usable by Precise, generating both the wake-word trigger as well as a guess of what individual spoke it.
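
In rough terms I am imagining a two-stage check like the sketch below. This is only an illustration, not an existing Precise API: embed_voice stands in for whatever speaker-embedding (d-vector) model you use, such as the one behind Google's release.

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_enrolled_speaker(audio, enrolled_embedding, embed_voice,
                        threshold=0.75):
    """Return True if the wake-word audio matches the enrolled voice."""
    candidate = embed_voice(audio)          # -> 1-D speaker embedding vector
    return cosine_similarity(candidate, enrolled_embedding) >= threshold

# Intended flow:
#   1. Precise fires on the wake word (any voice).
#   2. The buffered audio goes through is_enrolled_speaker().
#   3. The assistant only responds if both checks pass.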

MuruganR96 commented 5 years ago

@MatthewScholefield
https://github.com/MycroftAI/mycroft-precise/issues/46#issuecomment-438144468 Sir, I tried that second suggestion.

I added one more GRU layer and changed the MFCC features as well:

from keras.layers import Dense, GRU
from keras.models import Sequential
# ListenerParams and Vectorizer come from precise's params module
# (adjust the import path if your checkout differs)
from precise.params import ListenerParams, Vectorizer

# Feature settings: plain MFCCs, 20 coefficients, no deltas
pr = ListenerParams(
    window_t=0.1, hop_t=0.03, buffer_t=4.0, sample_rate=16000,
    sample_depth=2, n_mfcc=20, n_filt=50, n_fft=512, use_delta=False,
    vectorizer=Vectorizer.mfccs
)

# Two stacked GRU layers instead of one; `params` is the training-params
# object precise passes into create_model() (recurrent_units, dropout, ...)
model = Sequential()
model.add(GRU(
    params.recurrent_units, activation='tanh',
    input_shape=(pr.n_features, pr.feature_size),
    dropout=params.dropout, name='net',
    return_sequences=True
))
model.add(GRU(
    params.recurrent_units, activation='linear', dropout=params.dropout
))
model.add(Dense(1, activation='sigmoid'))
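
As a quick shape check (my own sketch; precise-train's actual loss and optimizer may differ), a model like the one above can be compiled and fed a dummy batch of MFCC windows shaped (n_features, feature_size):

# Assumes `model` and `pr` from the snippet above have been built.
import numpy as np

model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              metrics=['accuracy'])

dummy_batch = np.random.rand(8, pr.n_features, pr.feature_size).astype('float32')
dummy_labels = np.random.randint(0, 2, size=(8, 1)).astype('float32')

model.train_on_batch(dummy_batch, dummy_labels)
print(model.predict(dummy_batch).shape)   # (8, 1): wake-word probability per clip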

I was getting better accuracy compared with the previous model.


Loading wake-word...
Loading not-wake-word...
Using TensorFlow backend.
Data: <TrainData wake_words=2128 not_wake_words=99267 test_wake_words=304 test_not_wake_words=7012>
=== False Positives ===

=== False Negatives ===

=== Counts ===
False Positives: 0
True Negatives: 7012
False Negatives: 0
True Positives: 304

=== Summary ===
7316 out of 7316
100.0 %

0.0 % false positives
0.0 % false negatives

And sir @penrods, I saw that research paper: https://arxiv.org/pdf/1810.04719.pdf. It is almost the same as Mycroft's wake-word processing, but it runs instances of an RNN with embeddings for different speakers. The GitHub link is also available: https://github.com/google/uis-rnn

Now I am researching how to integrate mycroft-precise with uis-rnn.
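
The rough integration I have in mind looks like the sketch below. It is only a sketch: capture_recent_audio and verify_speaker are placeholders for the uis-rnn / speaker-embedding side, while the precise_runner part follows the project README.

from precise_runner import PreciseEngine, PreciseRunner

def capture_recent_audio():
    """Placeholder: return the last few seconds of microphone audio."""
    raise NotImplementedError

def verify_speaker(audio) -> bool:
    """Placeholder: embed the audio and compare it to the enrolled voice."""
    raise NotImplementedError

def on_activation():
    audio = capture_recent_audio()
    if verify_speaker(audio):
        print('Wake word from the enrolled speaker: proceed')
    else:
        print('Wake word heard, but the voice is not recognised: ignore')

engine = PreciseEngine('precise-engine/precise-engine', 'my-wake-word.pb')
runner = PreciseRunner(engine, on_activation=on_activation)
runner.start()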

Thank you very much, sir @MatthewScholefield and @penrods.

MatthewScholefield commented 5 years ago

@MuruganR96 I would suggest getting uis-rnn to work on its own first and then going forward from there.

MuruganR96 commented 5 years ago

Thank you, sir. I am now working on uis-rnn and will update you on my status.
