MycroftAI / mycroft-precise

A lightweight, simple-to-use, RNN wake word listener
Apache License 2.0

Mycroft Precise: is it only a wake word processor, or can it also achieve voice authentication? #46

Closed MuruganR96 closed 5 years ago

MuruganR96 commented 5 years ago

I am working with Mycroft Precise. I tested (precise-listen) with my generated model, and it predicted the wake word correctly for my voice and for my friends' voices as well as ( xxxxxx---------------------------------- ). But I need the wake word to work for voice authentication purposes, and I have only basic knowledge of voice authentication. How can I get the audio features for a particular person and verify them at the same time? Sir, please help me and give your suggestions and ideas. Thank you in advance.

MatthewScholefield commented 5 years ago

Precise is only meant as a wake word listener, not a voice authentication tool. However, if you would like to experiment with voice authentication, you can try a few things:

- Putting samples of your voice in the wake-word folder and other people's samples in the not-wake-word folder (a folder-layout sketch follows this comment).
- Experimenting with the model architecture and audio features, for example adding another recurrent layer or changing the MFCC parameters.

Let me know if this helps, and be sure to keep us updated on any progress you make.
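
For the first option, the data folder that precise-train reads looks roughly like this (a sketch of the layout described in the README; "hey-computer" is just an example name, so adjust it to your setup):

# Sketch of the training data layout precise-train expects.
from pathlib import Path

DATA_DIR = Path('hey-computer')
SUBDIRS = [
    'wake-word',            # recordings of *your* voice saying the wake word
    'not-wake-word',        # other voices, other words, background noise
    'test/wake-word',       # held-out positives for precise-test
    'test/not-wake-word',   # held-out negatives for precise-test
]

for sub in SUBDIRS:
    (DATA_DIR / sub).mkdir(parents=True, exist_ok=True)

# Quick sanity check on how many samples sit in each bucket
for sub in SUBDIRS:
    count = len(list((DATA_DIR / sub).glob('*.wav')))
    print(f'{sub}: {count} wav files')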

MuruganR96 commented 5 years ago

Thank you, sir.

Putting samples of your voice in the wake-word folder and other people's samples in the not-wake-word folder: saving my own audio to the wake-word folder, extracting features, and training on it is possible in real time. But getting other people's samples for the not-wake-word folder dynamically is not feasible in real time.

I tried your suggestion from this issue, sir: https://github.com/MycroftAI/mycroft-precise/issues/37#issuecomment-431266672

Try using the speech commands dataset (latest download here). It's an archive with a series of folders with a bunch of samples of different words. You can just drop that in the not-wake-word folder. Using the speech commands dataset as non-wake-word data, I am getting these results:

=== False Negatives ===

=== Counts ===
False Positives: 6905
True Negatives: 83
False Negatives: 0
True Positives: 304

=== Summary ===
387 out of 7292
5.31 %

98.81 % false positives
0.0 % false negatives
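
Those summary figures follow directly from the counts above; as a quick check:

fp, tn, fn, tp = 6905, 83, 0, 304

correct = tp + tn                     # 387 correctly classified clips
total = fp + tn + fn + tp             # 7292 test clips overall
print(f'{correct} out of {total} = {100 * correct / total:.2f} %')   # ~5.31 %

print(f'false positives: {100 * fp / (fp + tn):.2f} %')              # ~98.81 %
print(f'false negatives: {100 * fn / (fn + tp):.1f} %')              # 0.0 %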

Sir, I have one doubt: if we add more general, common non-wake-word audio such as the speech commands dataset or the Public Domain Sounds Backup, does that satisfy the "other people's samples" property of the not-wake-word folder?

Meanwhile, I will research your second suggestion, sir, and update you on the status.

Thank you very much for your quick response, sir.

MatthewScholefield commented 5 years ago

I would suggest recording some examples of your wake word specifically spoken by other people to put in the not-wake-word folder. The reason is that it's easy to distinguish between you saying your own wake word and someone else saying a totally different word. It's much harder to distinguish two different people saying the same word. This is why it would probably help to have even just a few samples of different voices saying your wake word in the not-wake-word folder.

penrods commented 5 years ago

You are looking for more than simple "wake word" and instead asking for "speaker identification". You might check out the work Google just released: https://venturebeat.com/2018/11/12/google-open-sources-ai-that-can-distinguish-between-voices-with-92-percent-accuracy/

If you do, I'm curious as to the results you see. I am guessing the approach Google released might be usable by Precise, generating both the wake-word trigger as well as a guess of what individual spoke it.
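
In rough terms I am imagining a two-stage check like the sketch below. This is only an illustration, not an existing Precise API: embed_voice stands in for whatever speaker-embedding (d-vector) model you use, such as the one behind Google's release.

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_enrolled_speaker(audio, enrolled_embedding, embed_voice,
                        threshold=0.75):
    """Return True if the wake-word audio matches the enrolled voice."""
    candidate = embed_voice(audio)          # -> 1-D speaker embedding vector
    return cosine_similarity(candidate, enrolled_embedding) >= threshold

# Intended flow:
#   1. Precise fires on the wake word (any voice).
#   2. The buffered audio goes through is_enrolled_speaker().
#   3. The assistant only responds if both checks pass.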

MuruganR96 commented 5 years ago

@MatthewScholefield
https://github.com/MycroftAI/mycroft-precise/issues/46#issuecomment-438144468 Sir, I tried that second suggestion.

I added one more GRU layer and changed the MFCC features as well:

from keras.layers import Dense, GRU
from keras.models import Sequential
# ListenerParams and Vectorizer come from precise's params module
# (adjust the import path if your checkout differs)
from precise.params import ListenerParams, Vectorizer

# Feature settings: plain MFCCs, 20 coefficients, no deltas
pr = ListenerParams(
    window_t=0.1, hop_t=0.03, buffer_t=4.0, sample_rate=16000,
    sample_depth=2, n_mfcc=20, n_filt=50, n_fft=512, use_delta=False,
    vectorizer=Vectorizer.mfccs
)

# Two stacked GRU layers instead of one; `params` is the training-params
# object precise passes into create_model() (recurrent_units, dropout, ...)
model = Sequential()
model.add(GRU(
    params.recurrent_units, activation='tanh',
    input_shape=(pr.n_features, pr.feature_size),
    dropout=params.dropout, name='net',
    return_sequences=True
))
model.add(GRU(
    params.recurrent_units, activation='linear', dropout=params.dropout
))
model.add(Dense(1, activation='sigmoid'))
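
As a quick shape check (my own sketch; precise-train's actual loss and optimizer may differ), a model like the one above can be compiled and fed a dummy batch of MFCC windows shaped (n_features, feature_size):

# Assumes `model` and `pr` from the snippet above have been built.
import numpy as np

model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              metrics=['accuracy'])

dummy_batch = np.random.rand(8, pr.n_features, pr.feature_size).astype('float32')
dummy_labels = np.random.randint(0, 2, size=(8, 1)).astype('float32')

model.train_on_batch(dummy_batch, dummy_labels)
print(model.predict(dummy_batch).shape)   # (8, 1): wake-word probability per clip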

I was getting better accuracy compared with the previous model.


Loading wake-word...
Loading not-wake-word...
Using TensorFlow backend.
Data: <TrainData wake_words=2128 not_wake_words=99267 test_wake_words=304 test_not_wake_words=7012>
=== False Positives ===

=== False Negatives ===

=== Counts ===
False Positives: 0
True Negatives: 7012
False Negatives: 0
True Positives: 304

=== Summary ===
7316 out of 7316
100.0 %

0.0 % false positives
0.0 % false negatives

And sir @penrods, I saw that research paper: https://arxiv.org/pdf/1810.04719.pdf. It is almost the same as Mycroft's wake-word processing, but it runs instances of an RNN with embeddings for different speakers. The GitHub link is also available: https://github.com/google/uis-rnn

Now I am researching how to integrate mycroft-precise with uis-rnn.
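
The rough integration I have in mind looks like the sketch below. It is only a sketch: capture_recent_audio and verify_speaker are placeholders for the uis-rnn / speaker-embedding side, while the precise_runner part follows the project README.

from precise_runner import PreciseEngine, PreciseRunner

def capture_recent_audio():
    """Placeholder: return the last few seconds of microphone audio."""
    raise NotImplementedError

def verify_speaker(audio) -> bool:
    """Placeholder: embed the audio and compare it to the enrolled voice."""
    raise NotImplementedError

def on_activation():
    audio = capture_recent_audio()
    if verify_speaker(audio):
        print('Wake word from the enrolled speaker: proceed')
    else:
        print('Wake word heard, but the voice is not recognised: ignore')

engine = PreciseEngine('precise-engine/precise-engine', 'my-wake-word.pb')
runner = PreciseRunner(engine, on_activation=on_activation)
runner.start()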

Thank you very much, sir @MatthewScholefield and @penrods.

MatthewScholefield commented 5 years ago

@MuruganR96 I would suggest getting uis-rnn to work on its own first and then going forward from there.

MuruganR96 commented 5 years ago

Thank you, sir. I am now working on uis-rnn and will update you on my status.
