minghau opened this issue 5 years ago
Thanks for the bug report. A difference in accuracy can be the result of not using the exact same featurizer at runtime that you used to featurize the training data. Can you include the command lines you used with `make_featurizer` and `make_dataset`? It also looks like your `test_ell_model.py` command line got truncated in your output above. I just ran my own test with the attached scripts and it seems to work fine, with a final pass rate of 92.27%.
Thank you so much for your prompt reply. Indeed, there were probably issues with the folders I used, since I compiled several times. Anyway, for future reference, this is the script for Linux (some lines were cut off at the terminal margin, shown by the trailing `$`):
```sh
python tools/utilities/pythonlibs/audio/training/make_featurizer.py --sample_rate 16000 --window_size 400 --input_buffer_size 160 --nfft 512 --filterbank_type mel --filterbank_$
python tools/wrap/wrap.py --model_file featurizer.ell --module_name mfcc --outdir compiled_featurizer
python tools/utilities/pythonlibs/audio/training/make_dataset.py --outdir compiled_featurizer --categories categories.txt --featurizer compiled_featurizer/mfcc --window_size 10$
python tools/utilities/pythonlibs/audio/training/make_dataset.py --outdir compiled_featurizer --categories categories.txt --featurizer compiled_featurizer/mfcc --window_size 10$
python tools/utilities/pythonlibs/audio/training/make_dataset.py --outdir compiled_featurizer --categories categories.txt --featurizer compiled_featurizer/mfcc --window_size 10$
python tools/utilities/pythonlibs/audio/training/train_classifier.py --architecture GRU --epochs 30 --num_layers 2 --hidden_units 128 --use_gpu --dataset compiled_featurizer --$
python tools/importers/onnx/onnx_import.py GRU128KeywordSpotter.onnx
python tools/wrap/wrap.py --model_file GRU128KeywordSpotter.ell --outdir KeywordSpotter --module_name model
python tools/utilities/pythonlibs/audio/training/test_ell_model.py --featurizer compiled_featurizer/mfcc --classifier KeywordSpotter/model --list_file audio/testing_list.txt -$
```
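For anyone reading along, the model that `train_classifier.py` builds with `--architecture GRU --num_layers 2 --hidden_units 128` is, roughly, a stacked GRU followed by a linear layer over the keyword categories. Here is a minimal PyTorch sketch of that shape; it is my own illustration, not ELL's actual code, and the input feature size (80) and number of keywords (30) are assumptions since those flags are truncated above:

```python
import torch
import torch.nn as nn

class KeywordSpotter(nn.Module):
    """Illustrative stand-in for a 2-layer, 128-unit GRU keyword spotter.
    Feature size and category count below are assumed, not from the thread."""
    def __init__(self, input_size=80, hidden_units=128, num_layers=2, num_keywords=30):
        super().__init__()
        self.gru = nn.GRU(input_size, hidden_units, num_layers, batch_first=True)
        self.classifier = nn.Linear(hidden_units, num_keywords)

    def forward(self, x):
        # x: (batch, time, features) -- one featurized audio frame per time step
        out, _ = self.gru(x)
        # classify from the final time step's hidden output (one common choice)
        return self.classifier(out[:, -1, :])

model = KeywordSpotter()
dummy = torch.randn(4, 98, 80)   # 4 clips, 98 frames, 80 features (assumed shapes)
logits = model(dummy)            # -> (4, 30)
```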
By the way, any suggestions regarding training for Chinese? I saw that the number of audio samples per keyword is approximately 2000. Is there anything else I have to do besides collecting recordings for 30 keywords in Chinese?
Thanks!
It shouldn't matter which language the keywords are spoken in; you could probably even do multiple languages if you had enough recordings. But of course, put each language in its own folder and don't mix them. And yes, the key to any deep neural network is lots and lots of clean data. You can increase the size of the dataset by mixing in low-volume random background noise; `make_dataset.py` has some options along these lines (the sketch below shows the idea). Happy training!
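A minimal sketch of that kind of noise augmentation, assuming mono float WAV files and the `soundfile` library; the function and the 0.1 gain are my own choices for illustration, not `make_dataset.py`'s actual options:

```python
import numpy as np
import soundfile as sf  # assumed available; any WAV I/O library works

def mix_noise(speech_path, noise_path, out_path, noise_gain=0.1):
    """Mix low-volume background noise into a speech clip (assumes mono audio)."""
    speech, sr = sf.read(speech_path)
    noise, nsr = sf.read(noise_path)
    assert sr == nsr, "resample the noise to match the speech sample rate first"
    # loop the noise if it is shorter than the speech, then trim to length
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]
    # attenuate the noise and keep the mix in the valid [-1, 1] range
    mixed = np.clip(speech + noise_gain * noise, -1.0, 1.0)
    sf.write(out_path, mixed, sr)
```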
Also, your call to `test_ell_model.py` might be missing the `--auto_scale` option, depending on whether you specified it in the call to `make_featurizer`.
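For context, my understanding is that an auto-scale step maps raw 16-bit PCM samples into `[-1, 1]` floats before featurization; the sketch below is an assumption about ELL's behavior, not its actual code, but it shows why a train/test mismatch on this flag can wreck accuracy (features end up off by a factor of 32768):

```python
import numpy as np

def scale_audio(frame, auto_scale=True):
    """Sketch of a typical auto-scale step: int16 PCM -> floats in [-1, 1]."""
    frame = np.asarray(frame, dtype=np.float32)
    return frame / 32768.0 if auto_scale else frame

raw = np.array([0, 16384, -32768], dtype=np.int16)
print(scale_audio(raw))         # [ 0.   0.5 -1. ]
print(scale_audio(raw, False))  # [ 0.  16384. -32768.]
```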
Hello, I followed all the instructions (leaving aside the fact that the tutorials are not up to date) and trained a GRU model from scratch for 30 and then 150 epochs; however, during the test phase the accuracy is around 10%.
Sample output is as follows:
During training the accuracy is good:
The script I used is:
Thanks