MycroftAI / mycroft-precise

A lightweight, simple-to-use, RNN wake word listener
Apache License 2.0

Precise Custom Wake Word mic to be used and additional data set training #53

Closed tech-novic closed 5 years ago

tech-novic commented 5 years ago

I trained a model with a dataset of 200+ samples (about 15 samples from different individuals), using method 2 to take out false negatives. When tested, the model works well with the mic I used to collect the dataset. With other mics the results differ: with the laptop mic there are a lot of false positives, and with another mic it is very hard to get an activation. I also exported the model to TensorFlow and used it with a 6-mic array, and couldn't get good activation. I have two questions:

  1. Is there any dependency on the mic used for capturing the dataset and what is the recommendation?
  2. How do I train the same model with new dataset?
MatthewScholefield commented 5 years ago

I have indeed noticed a dependency on the mic used. This could be both because of the different noise frequencies it picks up, and because of the different volumes. Apart from recording samples on a few different mics, you can try recording a long audio of silence on a few other mics (almost like getting the "noise profile") and using precise-add-noise to create a dataset with the noise of all the mics.
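As a rough sketch of the noise-recording step, the snippet below prints one arecord command per microphone. The device names, output folder, and duration are placeholders, and the 16 kHz, mono, 16-bit format is an assumption based on Precise's default audio parameters; check your own setup before relying on it.

```shell
# Hypothetical ALSA device names; list the real ones with `arecord -L`.
mics="plughw:1,0 plughw:2,0"
noise_dir=mic-noise   # hypothetical folder to pass later as noise_folder
seconds=60            # length of each "silence" recording

# Print the arecord command that captures a noise profile for one device.
noise_cmd() {
    dev=$1
    # Turn the device name into a safe file name, e.g. plughw:1,0 -> plughw_1_0
    name=$(printf '%s' "$dev" | tr -c 'A-Za-z0-9' '_')
    printf 'arecord -D %s -f S16_LE -r 16000 -c 1 -d %s %s\n' \
        "$dev" "$seconds" "$noise_dir/noise-$name.wav"
}

for dev in $mics; do
    noise_cmd "$dev"
done
```

Review the printed commands, create the output folder, and then run them (for example by piping the output to sh) while each mic records a quiet room.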

However, this might not solve the volume issue. Thinking about it, I should probably add a feature to precise-add-noise to randomly vary the output volume.
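Until such a feature exists, one possible workaround is to create volume-varied copies of the dataset with an external tool. The sketch below is not part of Precise: it assumes sox is installed and uses its vol effect, and the function name and the 0.3 to 1.0 gain range are arbitrary choices for illustration.

```shell
# Emit one "mkdir + sox" command per wav under $1, writing a copy with a
# random gain in [0.3, 1.0] to the same relative path under $2.
# $3 is an optional seed so that a run can be repeated.
vary_volume_cmds() {
    src=${1%/}; dst=${2%/}; seed=${3:-$$}
    find "$src" -type f -name '*.wav' | sort |
    awk -v src="$src" -v dst="$dst" -v seed="$seed" -v lo=0.3 -v hi=1.0 '
        BEGIN { srand(seed) }
        {
            # Mirror the relative path of the input file under dst
            out = dst substr($0, length(src) + 1)
            dir = out; sub("/[^/]*$", "", dir)
            gain = lo + rand() * (hi - lo)
            printf "mkdir -p \"%s\" && sox \"%s\" \"%s\" vol %.2f\n", dir, $0, out, gain
        }'
}
```

For example, `vary_volume_cmds my-wake-word my-wake-word-vol | sh` would write the scaled copies, where both folder names are placeholders. File names containing double quotes are not handled by this sketch.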

Edit: And you can train the same model on a new dataset just like you would expect (passing the name of the model in the command, but passing in the folder with the new dataset).

Let me know if this helps.

tech-novic commented 5 years ago

Thanks Matthew. I will first try collecting noise from different mics, and I also plan to collect data from different mic sources. I will keep you posted on the outcome.

tech-novic commented 5 years ago

I have one question on precise-add-noise. Will the syntax be: precise-add-noise <path to folder containing wake word dataset> <path to folder containing noise dataset> <path to folder to write output>?

After this, I assume I should run precise-train on the output folder from above.

MatthewScholefield commented 5 years ago

You can always run the command with --help to check how to use it:

$ precise-add-noise --help
usage: precise-add-noise [-h] [-tg TAGS_FILE] [-if INFLATION_FACTOR]
                         [-nl NOISE_RATIO_LOW] [-nh NOISE_RATIO_HIGH]
                         folder noise_folder output_folder

Create a duplicate dataset with added noise

positional arguments:
  folder                Folder containing source dataset
  noise_folder          Folder with wav files containing noise to be added
  output_folder         Folder to write the duplicate generated dataset

optional arguments:
  -h, --help            show this help message and exit
  -tg TAGS_FILE, --tags-file TAGS_FILE
                        Tags file to optionally load from. Default: -
  -if INFLATION_FACTOR, --inflation-factor INFLATION_FACTOR
                        The number of noisy samples generated per single
                        source sample. Default: 1
  -nl NOISE_RATIO_LOW, --noise-ratio-low NOISE_RATIO_LOW
                        Minimum random ratio of noise to sample. 1.0 is all
                        noise, no sample sound. Default: 0.0
  -nh NOISE_RATIO_HIGH, --noise-ratio-high NOISE_RATIO_HIGH
                        Maximum random ratio of noise to sample. 1.0 is all
                        noise, no sample sound. Default: 0.4

As you can see, the 3 required arguments, in order, are:

  folder                Folder containing source dataset
  noise_folder          Folder with wav files containing noise to be added
  output_folder         Folder to write the duplicate generated dataset

As you may now guess, this will create a duplicated dataset with the noise added in. You would then point the data folder in precise-train to this new, duplicated dataset.
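Put together, the two steps might look like the sketch below. The folder and model names are placeholders, the -if/-nl/-nh values are arbitrary examples of the options listed in the help output, and the `-e 60` epoch count for precise-train is an assumption borrowed from typical Precise usage. By default the script only prints the commands; set RUN=1 to execute them.

```shell
dataset=my-wake-word        # hypothetical source dataset folder
noise=mic-noise             # hypothetical folder of noise wav files
noisy=my-wake-word-noisy    # hypothetical output folder
model=my-wake-word.net      # hypothetical model file

# Print a command, and execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

pipeline() {
    # 3 noisy copies per sample, noise ratio drawn between 0.1 and 0.4
    run precise-add-noise -if 3 -nl 0.1 -nh 0.4 "$dataset" "$noise" "$noisy" &&
    # Train on the generated dataset instead of the original one
    run precise-train -e 60 "$model" "$noisy"
}

pipeline
```

Passing the name of an existing model file to precise-train continues training that model on the new folder, as described earlier in this thread.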

tech-novic commented 5 years ago

Hi Matthew, here is an update from using precise-add-noise. I collected noise samples from different mics as recommended and generated the output wav files. Then I trained, and when I tested the model with precise-listen I got a lot of false activations. Next I did incremental training with data\random (method 2). This resolved the false activations, and when I tested with precise-listen using different mics it gave good results. I then converted the model, but when I used it with mycroft-core it did not give the same result: I got very few activations, and only when putting a lot of stress on the wake word. I played around with the threshold value, the multiplier, and the energy ratio, but this did not make much difference. I believe the model is now trained well, but the configuration of mycroft-core needs more tuning to use it. Is there any recommendation I can try here?

MatthewScholefield commented 5 years ago

The problem might be that there's been a change with the audio processing library Precise uses. Mycroft Core I think is still using the old one. Sorry I just realized this, but I think your problem should be fixed if you modify vectorizer= in ListenerParams within precise/params.py to be vectorizer=Vectorizer.speechpy_mfccs before training a new model. You can also try making that code change and deleting the my-copied-model.net.params and retraining it since the audio inputs will be similar, just slightly different.
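Before retraining, a quick check like the one below can confirm whether the edit is in place. This is only a sketch: it assumes the setting appears in precise/params.py as text of the form vectorizer=Vectorizer.speechpy_mfccs, and the helper name is made up for illustration.

```shell
# Succeeds if the given params.py already selects the speechpy vectorizer.
uses_speechpy() {
    grep -Eq 'vectorizer *= *Vectorizer\.speechpy_mfccs' "$1"
}

params=precise/params.py          # path from the comment above, relative to the repo root
stale=my-copied-model.net.params  # placeholder model name from the comment above

if [ -f "$params" ]; then
    if uses_speechpy "$params"; then
        echo "vectorizer is speechpy_mfccs; delete $stale (if present) and retrain"
    else
        echo "vectorizer is not speechpy_mfccs yet; edit ListenerParams in $params first"
    fi
fi
```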

tech-novic commented 5 years ago

Thanks Matthew, I will try as suggested and share results here.

Edit: Quick update: I made the change to the code and retrained the model. I tried it with mycroft-core on my laptop with a headset and it worked like a charm. Tomorrow I will try it with the mic array and share an update.

el-tocino commented 5 years ago

I retrained my models with this setting as well. They definitely feel more correctly responsive. The training took a few more steps to get where I liked it.

tech-novic commented 5 years ago

Hi Matthew, I tested with the mic array today and it is working well. Thanks for all your support.

MatthewScholefield commented 5 years ago

Awesome to hear! Closing, but let me know if you have any other issues.

EuphoriaCelestial commented 3 years ago

precise-add-noise

@MatthewScholefield I can't find any usage of "precise-add-noise" in the training tutorial. Has it been removed?

MatthewScholefield commented 3 years ago

@EuphoriaCelestial Not removed, it's just I don't think it's ever been documented. However, we'd love contributions. Feel free to create a new page on the wiki about it.

EuphoriaCelestial commented 3 years ago

Yeah, I had never known about that command until I reached this issue. Can you provide more information on what it does and how to use it in the training process?

MatthewScholefield commented 3 years ago

@EuphoriaCelestial Is there any part you'd like me to expand on? What I explained earlier in this thread covers most of it: run precise-add-noise --help to check the usage, pass the source dataset folder, the noise folder, and the output folder in that order, and then point the data folder in precise-train to the new, duplicated dataset.

This would be most useful to do in cases where you don't have a lot of wakewords and want to generate more variations of the data.

EuphoriaCelestial commented 3 years ago

"Sorry I just realized this, but I think your problem should be fixed if you modify vectorizer= in ListenerParams within precise/params.py to be vectorizer=Vectorizer.speechpy_mfccs before training a new model."

@MatthewScholefield should I do this?

MatthewScholefield commented 3 years ago

@EuphoriaCelestial First, just to clarify, this only pertains to Mycroft Core. To make it work with Mycroft Core, I think it's actually more reliable (but still a bit hacky) to do a source install of Precise on the same platform you run Mycroft Core on. Then you can link the source install's engine script to where Mycroft Core expects it:

default_engine_path=~/.mycroft/precise/precise-engine/precise-engine

# Back up default precise-engine
mv "$default_engine_path" "$default_engine_path.bak"

# Link source install to Mycroft Core
cd mycroft-precise/  # Source install location
ln -s "$(pwd)/.venv/bin/precise-engine" "$default_engine_path"

If you do this you would definitely not need to modify the vectorizer.
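To undo the link later, something along these lines should work. It is a sketch that assumes the backup was created with the .bak suffix as in the commands above; the function name is made up for illustration.

```shell
# Remove the symlink and put the backed-up default engine back in place.
# $1 optionally overrides the engine path (defaults to the path used above).
restore_default_engine() {
    path=${1:-"$HOME/.mycroft/precise/precise-engine/precise-engine"}
    if [ -L "$path" ] && [ -e "$path.bak" ]; then
        rm "$path" && mv "$path.bak" "$path"
    else
        echo "nothing to restore at $path" >&2
        return 1
    fi
}
```

The function refuses to do anything unless the path is currently a symlink and a backup exists, so calling it on an untouched install leaves the default engine alone.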

EuphoriaCelestial commented 3 years ago

@MatthewScholefield I encountered this error when running precise-add-noise:

WARNING: Found 676 wavs but no tags file specified!
Data: <TrainData wake_words=0 not_wake_words=0 test_wake_words=0 test_not_wake_words=0>
Done!

I don't know what happened. Just yesterday it was still working fine, and I generated hundreds of files using this command. Now, with the same command, same PC, and same environment, nothing works anymore, even with old files that I had successfully added noise to before. It's really confusing. I tried adding a tag using VLC, but it doesn't work.
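One thing that may be worth checking here, since the thread does not give a confirmed cause: the summary shows zero samples in every category, which could mean the wav files are not inside the subfolders Precise looks for. The sketch below counts wav files per subfolder, assuming the usual Precise dataset layout (wake-word, not-wake-word, and the same two under test); the function name is made up for illustration.

```shell
# Count wav files in each subfolder that a Precise dataset is assumed to use.
# $1 is the dataset root that was passed to precise-add-noise as "folder".
count_wavs() {
    for sub in wake-word not-wake-word test/wake-word test/not-wake-word; do
        n=$(find "$1/$sub" -type f -name '*.wav' 2>/dev/null | wc -l)
        echo "$sub: $((n))"
    done
}
```

If `count_wavs path/to/dataset` reports 0 everywhere while the folder clearly contains wav files, the files are probably sitting at the top level or in differently named folders.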