Closed tech-novic closed 5 years ago
I have indeed noticed a dependency on the mic used. This could be both because of the different noise frequencies it picks up, and because of the different volumes. Apart from recording samples on a few different mics, you can try recording a long audio of silence on a few other mics (almost like getting the "noise profile") and using precise-add-noise to create a dataset with the noise of all the mics.
However, this might not solve the volume issue. Thinking about it, I should probably add a feature to precise-add-noise to randomly vary the output volume.
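The volume-variation idea above is not something precise-add-noise is described as doing in this thread. As a rough sketch of what such a step could look like, assuming samples are floats in [-1.0, 1.0] and using a hypothetical helper name (apply_random_gain) and gain range that are not part of Precise:

```python
import random


def apply_random_gain(samples, low=0.5, high=1.0, rng=random):
    """Scale every sample by one randomly chosen gain, clipping to [-1.0, 1.0].

    Hypothetical helper for illustration only; not part of precise-add-noise.
    `samples` is a list of floats in [-1.0, 1.0].
    Returns (scaled_samples, gain) so the chosen gain can be logged.
    """
    gain = rng.uniform(low, high)
    scaled = [max(-1.0, min(1.0, s * gain)) for s in samples]
    return scaled, gain
```

Applying this to each generated clip would give the model examples of the same wake word at several loudness levels, which is the gap the noise profiles alone may not cover.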
Edit: And you can train the same model on a new dataset just like you would expect (passing the name of the model in the command, but passing in the folder with the new dataset).
Let me know if this helps.
Thanks Matthew, I will first try collecting noise from different mics, and I also plan to collect data from different mic sources. I will keep you posted on the outcome.
I have one question on precise-add-noise: will the syntax be precise-add-noise <path to folder containing wake word dataset> <path to folder containing noise dataset> <path to folder to write output>? After this, I assume I should use precise-train.
You can always run the command with --help to check how to use it:
$ precise-add-noise --help
usage: precise-add-noise [-h] [-tg TAGS_FILE] [-if INFLATION_FACTOR]
                         [-nl NOISE_RATIO_LOW] [-nh NOISE_RATIO_HIGH]
                         folder noise_folder output_folder

Create a duplicate dataset with added noise

positional arguments:
  folder                Folder containing source dataset
  noise_folder          Folder with wav files containing noise to be added
  output_folder         Folder to write the duplicate generated dataset

optional arguments:
  -h, --help            show this help message and exit
  -tg TAGS_FILE, --tags-file TAGS_FILE
                        Tags file to optionally load from. Default: -
  -if INFLATION_FACTOR, --inflation-factor INFLATION_FACTOR
                        The number of noisy samples generated per single
                        source sample. Default: 1
  -nl NOISE_RATIO_LOW, --noise-ratio-low NOISE_RATIO_LOW
                        Minimum random ratio of noise to sample. 1.0 is all
                        noise, no sample sound. Default: 0.0
  -nh NOISE_RATIO_HIGH, --noise-ratio-high NOISE_RATIO_HIGH
                        Maximum random ratio of noise to sample. 1.0 is all
                        noise, no sample sound. Default: 0.4
As you can see, the 3 required arguments, in order, are:
  folder          Folder containing source dataset
  noise_folder    Folder with wav files containing noise to be added
  output_folder   Folder to write the duplicate generated dataset
As you may now guess, this will create a duplicated dataset with the noise added in. You would then point the data folder in precise-train to this new, duplicated dataset.
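To make the noise-ratio options in the help text concrete, here is a minimal sketch of what a ratio between 0.0 and 1.0 could mean. It assumes a plain linear blend of sample and noise, which matches the help wording ("1.0 is all noise, no sample sound") but is not confirmed here as the tool's actual implementation. The function names are hypothetical; the defaults mirror the help output (inflation factor 1, ratio range 0.0 to 0.4).

```python
import random


def mix_with_noise(sample, noise, noise_ratio):
    """Linear blend: 0.0 keeps only the sample, 1.0 keeps only the noise.

    Assumed mixing rule for illustration; the real tool may scale differently.
    """
    if not 0.0 <= noise_ratio <= 1.0:
        raise ValueError("noise_ratio must be between 0.0 and 1.0")
    if len(noise) < len(sample):
        raise ValueError("noise must be at least as long as the sample")
    return [(1.0 - noise_ratio) * s + noise_ratio * n
            for s, n in zip(sample, noise)]


def noisy_variants(sample, noise, inflation_factor=1,
                   ratio_low=0.0, ratio_high=0.4, rng=random):
    """Generate `inflation_factor` noisy copies, each with its own random ratio."""
    return [mix_with_noise(sample, noise, rng.uniform(ratio_low, ratio_high))
            for _ in range(inflation_factor)]
```

Under this reading, raising the inflation factor multiplies the dataset size, while the low/high ratios bound how strongly the noise can drown out the wake word in any one copy.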
Hi Matthew, an update from using precise-add-noise. I collected noise samples from different mics as recommended and generated the output wav files. Then I trained the model, and when I tested it with precise-listen I got a lot of false activations. I then did an incremental training with data\random (method 2). This resolved the false activations, and when I tested with precise-listen using different mics it gave good results.
I then converted the model, and when I used it with mycroft-core it did not give the same result. I got very few activations, and only when putting a lot of stress on the wake word. I played around with the threshold value, the multiplier, and the energy ratio, but this did not make much difference. I believe the model is now trained well, but the configuration of mycroft-core needs more tuning to use it. Is there any recommendation I can try out here?
The problem might be that there's been a change in the audio processing library Precise uses. Mycroft Core, I think, is still using the old one. Sorry I only just realized this, but I think your problem should be fixed if you modify vectorizer= in ListenerParams within precise/params.py to be vectorizer=Vectorizer.speechpy_mfccs before training a new model. You can also try making that code change, deleting my-copied-model.net.params, and retraining the existing model, since the audio inputs will be similar, just slightly different.
Thanks Matthew, I will try as suggested and share results here.
Edit: Quick update: I made the change to the code and retrained the model. I tried it with mycroft-core on my laptop with a headset and it worked like a charm. Tomorrow I will try with the mic array and share an update.
I retrained my models with this setting as well. They definitely feel more correctly responsive. The training took a few more steps to get where I liked it.
Hi Matthew, I tested with mic array today and it is working good. Thanks for all your support.
Awesome to hear! Closing, but let me know if you have any other issues.
@MatthewScholefield I don't find any usage of "precise-add-noise" in the training tutorial. Was it removed?
@EuphoriaCelestial Not removed, it's just I don't think it's ever been documented. However, we'd love contributions. Feel free to create a new page on the wiki about it.
Yeah, I never knew about that command until I reached this issue. Can you provide more information on what it does and how to use it in the training process?
@EuphoriaCelestial Is there any part you'd like me to expand on? What I explained from before covers most of it:
You can always run the command with --help to check how to use it:
$ precise-add-noise --help
usage: precise-add-noise [-h] [-tg TAGS_FILE] [-if INFLATION_FACTOR]
                         [-nl NOISE_RATIO_LOW] [-nh NOISE_RATIO_HIGH]
                         folder noise_folder output_folder

Create a duplicate dataset with added noise

positional arguments:
  folder                Folder containing source dataset
  noise_folder          Folder with wav files containing noise to be added
  output_folder         Folder to write the duplicate generated dataset

optional arguments:
  -h, --help            show this help message and exit
  -tg TAGS_FILE, --tags-file TAGS_FILE
                        Tags file to optionally load from. Default: -
  -if INFLATION_FACTOR, --inflation-factor INFLATION_FACTOR
                        The number of noisy samples generated per single
                        source sample. Default: 1
  -nl NOISE_RATIO_LOW, --noise-ratio-low NOISE_RATIO_LOW
                        Minimum random ratio of noise to sample. 1.0 is all
                        noise, no sample sound. Default: 0.0
  -nh NOISE_RATIO_HIGH, --noise-ratio-high NOISE_RATIO_HIGH
                        Maximum random ratio of noise to sample. 1.0 is all
                        noise, no sample sound. Default: 0.4
As you can see, the 3 required arguments, in order, are:
  folder          Folder containing source dataset
  noise_folder    Folder with wav files containing noise to be added
  output_folder   Folder to write the duplicate generated dataset
As you may now guess, this will create a duplicated dataset with the noise added in. You would then point the data folder in precise-train to this new, duplicated dataset.
This would be most useful to do in cases where you don't have a lot of wakewords and want to generate more variations of the data.
Sorry I just realized this, but I think your problem should be fixed if you modify vectorizer= in ListenerParams within precise/params.py to be vectorizer=Vectorizer.speechpy_mfccs before training a new model.
@MatthewScholefield should I do this?
@EuphoriaCelestial First, just to clarify, this only pertains to Mycroft Core. Now, in order to make it work with Mycroft Core, I think it's actually more reliable (but still a bit hacky) to do a source install of Precise on the same platform you run Mycroft Core on. Then you can just link the source install's engine script to where Mycroft Core expects it:
default_engine_path=~/.mycroft/precise/precise-engine/precise-engine
# Back up default precise-engine
mv "$default_engine_path" "$default_engine_path.bak"
# Link source install to Mycroft Core
cd mycroft-precise/ # Source install location
ln -s "$(pwd)/.venv/bin/precise-engine" "$default_engine_path"
If you do this you would definitely not need to modify the vectorizer.
@MatthewScholefield I encountered this error when running precise-add-noise:
WARNING: Found 676 wavs but no tags file specified!
Data: <TrainData wake_words=0 not_wake_words=0 test_wake_words=0 test_not_wake_words=0>
Done!
I don't know what happened. Just yesterday it was still working fine, and I generated hundreds of files using this command. Now the same command on the same PC in the same environment doesn't work anymore, even with old files that I had successfully added noise to before. It's really confusing. I tried adding tags using VLC, but that doesn't work.
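The output above reports zero samples in every category, which suggests the tool did not find wav files where it looks for them. One possible reading, not confirmed in this thread, is that the wav files sit directly in the source folder instead of in the structured layout that Precise training uses (wake-word/, not-wake-word/, test/wake-word/, test/not-wake-word/). The sketch below is a quick way to check this; the helper name and the layout list are assumptions for illustration, not part of Precise.

```python
from pathlib import Path

# Subfolder names assumed from the structured dataset layout used for
# Precise training; adjust if your dataset is organised differently.
EXPECTED_SUBFOLDERS = [
    "wake-word",
    "not-wake-word",
    "test/wake-word",
    "test/not-wake-word",
]


def count_wavs(dataset_root):
    """Count .wav files at the dataset root and in each expected subfolder."""
    root = Path(dataset_root)
    counts = {"(root)": len(list(root.glob("*.wav")))}
    for sub in EXPECTED_SUBFOLDERS:
        folder = root / sub
        # A missing subfolder simply counts as zero files.
        counts[sub] = len(list(folder.glob("*.wav"))) if folder.is_dir() else 0
    return counts
```

Calling count_wavs on the folder passed as the first argument to precise-add-noise shows where the files actually are. A large "(root)" count with zeros everywhere else would be consistent with the warning.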
I trained a model with a dataset of 200+ samples (about 15 samples each from different individuals), using method 2 to take out false negatives. The model works well when tested with the mic I used to collect the dataset. When I use other mics the result is different: with the laptop mic there are a lot of false positives, and with another mic it is very hard to get an activation. I also exported the model to TensorFlow and used it with a 6-mic array and couldn't get good activation. There are 2 questions here,