Open gamnes opened 6 years ago
@gamnes
After training your own model, run test.py, then use fold_batchnorm.py to produce the *_bnfused checkpoint. Then run quant_test.py; you should get almost the same scores as test.py.
-------- Here are my commands for your reference --------
python test.py \
  --data_url= \
  --data_dir=/DataSet/ASR/speech_commands.v0.02 \
  --model_architecture ds_cnn \
  --model_size_info 5 64 10 4 2 2 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 \
  --dct_coefficient_count 10 \
  --window_size_ms 40 \
  --window_stride_ms 20 \
  --checkpoint /Result/speech_commands_train/best_ds_cnn_S/ds_cnn_9118.ckpt-4800

python fold_batchnorm.py \
  --model_architecture ds_cnn \
  --model_size_info 5 64 10 4 2 2 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 \
  --dct_coefficient_count 10 \
  --window_size_ms 40 \
  --window_stride_ms 20 \
  --checkpoint /Result/speech_commands_train/best_ds_cnn_S/ds_cnn_9118.ckpt-4800

python quant_test.py \
  --data_url= \
  --data_dir=/DataSet/ASR/speech_commands.v0.02 \
  --model_architecture=ds_cnn \
  --model_size_info 5 64 10 4 2 2 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 \
  --dct_coefficient_count 10 \
  --window_size_ms 40 \
  --window_stride_ms 20 \
  --checkpoint /Result/speech_commands_train/best_ds_cnn_S/ds_cnn_9118.ckpt-4800_bnfused \
  --act_max 32 0 0 0 0 0 0 0 0 0 0 0
Hi @yanyanem, thanks for the information! :) I haven't had time to try this yet, but will this give me the header files I need as well? That is, does quant_test.py produce the header files, or is there another step involved to get them (perhaps you just extract the numbers from the console)?
@gamnes quant_test.py will generate the weights.h file under the ML-KWS-for-MCU folder. You can then copy this file to replace ds_cnn_weights.h under \Deployment\Source\NN\DS_CNN. NOTE: you must rename each layer in weights.h to match the names used in ds_cnn_weights.h:
new name (weights.h)                          ->  old name (ds_cnn_weights.h)
DS_CNN_conv_1_weights_0                       ->  CONV1_WT
DS_CNN_conv_1_biases_0                        ->  CONV1_BIAS
DS_CNN_conv_ds_1_dw_conv_depthwise_weights_0  ->  CONV2_DS_WT
DS_CNN_conv_ds_1_dw_conv_biases_0             ->  CONV2_DS_BIAS
DS_CNN_conv_ds_1_pw_conv_weights_0            ->  CONV2_PW_WT
DS_CNN_conv_ds_1_pw_conv_biases_0             ->  CONV2_PW_BIAS
DS_CNN_conv_ds_2_dw_conv_depthwise_weights_0  ->  CONV3_DS_WT
DS_CNN_conv_ds_2_dw_conv_biases_0             ->  CONV3_DS_BIAS
DS_CNN_conv_ds_2_pw_conv_weights_0            ->  CONV3_PW_WT
DS_CNN_conv_ds_2_pw_conv_biases_0             ->  CONV3_PW_BIAS
DS_CNN_conv_ds_3_dw_conv_depthwise_weights_0  ->  CONV4_DS_WT
DS_CNN_conv_ds_3_dw_conv_biases_0             ->  CONV4_DS_BIAS
DS_CNN_conv_ds_3_pw_conv_weights_0            ->  CONV4_PW_WT
DS_CNN_conv_ds_3_pw_conv_biases_0             ->  CONV4_PW_BIAS
DS_CNN_conv_ds_4_dw_conv_depthwise_weights_0  ->  CONV5_DS_WT
DS_CNN_conv_ds_4_dw_conv_biases_0             ->  CONV5_DS_BIAS
DS_CNN_conv_ds_4_pw_conv_weights_0            ->  CONV5_PW_WT
DS_CNN_conv_ds_4_pw_conv_biases_0             ->  CONV5_PW_BIAS
DS_CNN_fc1_weights_0                          ->  FINAL_FC_WT
DS_CNN_fc1_biases_0                           ->  FINAL_FC_BIAS
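The renaming can be done by hand, but if anyone wants to script it, here is a minimal sketch (my own helper, not part of the repo) that rewrites the generated weights.h using the mapping above; only a few representative entries are spelled out:

```python
import re

# Names quant_test.py emits in weights.h mapped to the names that
# Deployment/Source/NN/DS_CNN/ds_cnn_weights.h expects (see the table above;
# the remaining conv_ds_2 .. conv_ds_4 layers follow the same pattern).
RENAMES = {
    "DS_CNN_conv_1_weights_0": "CONV1_WT",
    "DS_CNN_conv_1_biases_0": "CONV1_BIAS",
    "DS_CNN_conv_ds_1_dw_conv_depthwise_weights_0": "CONV2_DS_WT",
    "DS_CNN_conv_ds_1_dw_conv_biases_0": "CONV2_DS_BIAS",
    "DS_CNN_conv_ds_1_pw_conv_weights_0": "CONV2_PW_WT",
    "DS_CNN_conv_ds_1_pw_conv_biases_0": "CONV2_PW_BIAS",
    "DS_CNN_fc1_weights_0": "FINAL_FC_WT",
    "DS_CNN_fc1_biases_0": "FINAL_FC_BIAS",
}

def rename_macros(text):
    """Replace each generated identifier with its Deployment counterpart.

    Longest names go first so a name that is a prefix of another cannot
    clobber it; \\b keeps partial identifiers from matching.
    """
    for old in sorted(RENAMES, key=len, reverse=True):
        text = re.sub(r"\b%s\b" % re.escape(old), RENAMES[old], text)
    return text

# Usage sketch: read the generated weights.h, rewrite the names, and save
# the result as ds_cnn_weights.h for Deployment:
#   open("ds_cnn_weights.h", "w").write(rename_macros(open("weights.h").read()))
```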
@yanyanem I am trying to do the same, but there is an issue when executing quant_test.py. In my case, I want to train only one keyword (stop).
Command is:
python quant_test.py --model_architecture ds_cnn --model_size_info 5 64 10 4 2 2 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 --feature_bin_count 10 --window_size_ms 40 --window_stride_ms 40 --learning_rate 0.001 --checkpoint H:\tmp\pb_file1\speech_commands_train\ds_cnn.ckpt-3800_bnfused
and Error is:
H:\Anaconda\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
ground_truth_input
INFO:tensorflow:Restoring parameters from H:\tmp\pb_file1\speech_commands_train\ds_cnn.ckpt-3800_bnfused
label_count
DS-CNN_conv_1_weights_0 number of wts/bias: (10, 4, 1, 64) dec bits: 8 max: (0.25,0.24940382) min: (-0.28125,-0.27984563)
DS-CNN_conv_1_biases_0 number of wts/bias: (64,) dec bits: 8 max: (0.29296875,0.29396462) min: (-0.23828125,-0.23818439)
DS-CNN_conv_ds_1_dw_conv_depthwise_weights_0 number of wts/bias: (3, 3, 64, 1) dec bits: 5 max: (2.59375,2.5908468) min: (-2.59375,-2.5817773)
DS-CNN_conv_ds_1_dw_conv_biases_0 number of wts/bias: (64,) dec bits: 7 max: (0.9609375,0.957685) min: (-0.6953125,-0.696088)
DS-CNN_conv_ds_1_pw_conv_weights_0 number of wts/bias: (1, 1, 64, 64) dec bits: 7 max: (0.59375,0.59316313) min: (-0.609375,-0.6079887)
DS-CNN_conv_ds_1_pw_conv_biases_0 number of wts/bias: (64,) dec bits: 7 max: (0.7890625,0.7917288) min: (-0.734375,-0.7381487)
DS-CNN_conv_ds_2_dw_conv_depthwise_weights_0 number of wts/bias: (3, 3, 64, 1) dec bits: 5 max: (2.53125,2.5354643) min: (-2.09375,-2.1003373)
DS-CNN_conv_ds_2_dw_conv_biases_0 number of wts/bias: (64,) dec bits: 6 max: (1.296875,1.2918607) min: (-1.09375,-1.0935875)
DS-CNN_conv_ds_2_pw_conv_weights_0 number of wts/bias: (1, 1, 64, 64) dec bits: 7 max: (0.4921875,0.4945545) min: (-0.515625,-0.5150995)
DS-CNN_conv_ds_2_pw_conv_biases_0 number of wts/bias: (64,) dec bits: 7 max: (0.984375,0.9850131) min: (-0.984375,-0.9861221)
DS-CNN_conv_ds_3_dw_conv_depthwise_weights_0 number of wts/bias: (3, 3, 64, 1) dec bits: 6 max: (1.703125,1.697497) min: (-1.671875,-1.6774025)
DS-CNN_conv_ds_3_dw_conv_biases_0 number of wts/bias: (64,) dec bits: 6 max: (1.140625,1.1462464) min: (-1.25,-1.2437633)
DS-CNN_conv_ds_3_pw_conv_weights_0 number of wts/bias: (1, 1, 64, 64) dec bits: 8 max: (0.46875,0.46849197) min: (-0.46484375,-0.46592054)
DS-CNN_conv_ds_3_pw_conv_biases_0 number of wts/bias: (64,) dec bits: 6 max: (1.15625,1.1546066) min: (-0.875,-0.8820074)
DS-CNN_conv_ds_4_dw_conv_depthwise_weights_0 number of wts/bias: (3, 3, 64, 1) dec bits: 6 max: (1.71875,1.7205564) min: (-1.40625,-1.4096426)
DS-CNN_conv_ds_4_dw_conv_biases_0 number of wts/bias: (64,) dec bits: 6 max: (1.125,1.1232377) min: (-1.125,-1.1255203)
DS-CNN_conv_ds_4_pw_conv_weights_0 number of wts/bias: (1, 1, 64, 64) dec bits: 7 max: (0.5234375,0.5243965) min: (-0.5234375,-0.52615243)
DS-CNN_conv_ds_4_pw_conv_biases_0 number of wts/bias: (64,) dec bits: 6 max: (1.53125,1.5273341) min: (-1.1875,-1.1830426)
DS-CNN_fc1_weights_0 number of wts/bias: (64, 3) dec bits: 8 max: (0.390625,0.39195022) min: (-0.34375,-0.34377998)
DS-CNN_fc1_biases_0 number of wts/bias: (3,) dec bits: 8 max: (0.2890625,0.28785577) min: (-0.1796875,-0.1781917)
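The "dec bits" column in the output above is the number of fractional bits of the signed 8-bit fixed-point format chosen per tensor: the integer bits must cover the largest magnitude, and whatever remains of the 7 non-sign bits becomes the fraction. A rough sketch of that computation (simplified from what quant_test.py appears to do; the helper name is mine):

```python
import numpy as np

def dec_bits_for(weights, total_bits=8):
    """Fractional bits for a signed fixed-point format of `total_bits` bits.

    int_bits is the smallest power-of-two exponent covering the largest
    |weight|; the remaining (total_bits - 1) non-sign bits are fractional.
    """
    max_abs = float(np.max(np.abs(weights)))
    int_bits = int(np.ceil(np.log2(max_abs))) if max_abs > 0 else 0
    return (total_bits - 1) - int_bits

# Example from the log above: the dw_conv layer with |w| up to ~2.59
# gets 5 dec bits, i.e. each weight is stored as round(w * 2**5) in int8.
print(dec_bits_for(np.array([2.5908468, -2.5817773])))  # 5
```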
INFO:tensorflow:set_size=814
Traceback (most recent call last):
File "quant_test.py", line 320, in
ValueError: Cannot feed value of shape (100,) for Tensor 'groundtruth_input:0', which has shape '(?, 3)'
The script generates weights.h but ends with the ValueError above. I replaced ds_cnn_weights.h under \Deployment\Source\NN\DS_CNN with this weights.h, but on the board it only ever detects silence, never the keyword.
Where did I go wrong? And which model is best for keyword spotting?
@vrushalibhokare
I am not sure what the problem with your command is, but I noticed a few differences. Perhaps we have different versions of quant_test.py?
-- Here is my command for your reference --
python quant_test.py \
  --data_url= \
  --data_dir=/DataSet/ASR/speech_commands.v0.02 \
  --model_architecture=ds_cnn \
  --model_size_info 5 64 10 4 2 2 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 \
  --dct_coefficient_count 10 \
  --window_size_ms 40 \
  --window_stride_ms 20 \
  --act_max 32 0 0 0 0 0 0 0 0 0 0 0 \
  --checkpoint /Result/speech_commands_train/best_ds_cnn_S/ds_cnn_9118.ckpt-4800
@yanyanem
I am following https://github.com/ARM-software/ML-KWS-for-MCU and https://www.tensorflow.org/tutorials/sequences/audio_recognition for the keyword spotting implementation,
but I get this error:
ValueError: Cannot feed value of shape (100,) for Tensor 'groundtruth_input:0', which has shape '(?, 3)'
python quant_test.py --model_architecture ds_cnn --model_size_info 5 64 10 4 2 2 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 --feature_bin_count 10 --window_size_ms 40 --window_stride_ms 20 --learning_rate 0.001 --act_max 32 0 0 0 0 0 0 0 0 0 0 0 --checkpoint H:\tmp\pb_file1\speech_commands_train\ds_cnn.ckpt-3800
How can I train words other than the speech_commands words? Does the procedure remain the same? Did I miss any step in getting weights.h?
@vrushalibhokare
I think we have different code. I got all my code from ML-KWS-for-MCU, not from tensorflow/examples/speech_commands.
I git cloned the GitHub code and then ran the commands below. I put the dataset into /DataSet/ASR/speech_commands.v0.02; after train.py, I get ds_cnn_9118.ckpt-4800 in /Result/speech_commands_train/best.
python train.py \
  --data_url= \
  --data_dir=/DataSet/ASR/speech_commands.v0.02 \
  --model_architecture ds_cnn \
  --model_size_info 5 64 10 4 2 2 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 \
  --dct_coefficient_count 10 \
  --window_size_ms 40 \
  --window_stride_ms 20 \
  --learning_rate 0.0005,0.0001,0.00002 \
  --how_many_training_steps 10000,10000,10000 \
  --train_dir=/Result/speech_commands_train \
  --summaries_dir=/Result/retrain_logs
python test.py --data_url= --data_dir=/DataSet/ASR/speech_commands.v0.02 --model_architecture ds_cnn --model_size_info 5 64 10 4 2 2 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 --dct_coefficient_count 10 --window_size_ms 40 --window_stride_ms 20 --checkpoint /Result/speech_commands_train/best_ds_cnn_S/ds_cnn_9118.ckpt-4800
python fold_batchnorm.py --model_architecture ds_cnn --model_size_info 5 64 10 4 2 2 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 --dct_coefficient_count 10 --window_size_ms 40 --window_stride_ms 20 --checkpoint /Result/speech_commands_train/best_ds_cnn_S/ds_cnn_9118.ckpt-4800
python quant_test.py --data_url= --data_dir=/DataSet/ASR/speech_commands.v0.02 --model_architecture=ds_cnn --model_size_info 5 64 10 4 2 2 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 --dct_coefficient_count 10 --window_size_ms 40 --window_stride_ms 20 --checkpoint /Result/speech_commands_train/best_ds_cnn_S/ds_cnn_9118.ckpt-4800_bnfused --act_max 32 0 0 0 0 0 0 0 0 0 0 0
@vrushalibhokare
ValueError: Cannot feed value of shape (100,) for Tensor 'groundtruth_input:0', which has shape '(?, 3)'
I guess you should check label_count: with the default wanted words plus silence and unknown, label_count should be 12, but maybe you only have 3. See this code in quant_test.py:
ground_truth_input = tf.placeholder(tf.float32, [None, label_count], name='groundtruth_input')
Maybe your train.py used label_count = 3, but quant_test.py used label_count = 12?
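The mismatch is easy to reproduce: label_count is derived from --wanted_words plus the two labels that the input pipeline always prepends (_silence_ and _unknown_), so training and quant_test.py must be run with the same word list. A small sketch of the arithmetic (the helper name is mine):

```python
def label_count_for(wanted_words):
    """label_count as the scripts derive it: the wanted words plus the two
    built-in labels _silence_ and _unknown_ that input_data prepends."""
    words = [w for w in wanted_words.split(",") if w]
    return len(words) + 2

# Training with the default ten words gives 12 output classes...
print(label_count_for("yes,no,up,down,left,right,on,off,stop,go"))  # 12
# ...but training on a single keyword gives only 3. Feeding labels built
# for one count into a placeholder built for the other produces a shape
# error like the one above.
print(label_count_for("stop"))  # 3
```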
By the way, even with this error, weights.h should still be generated correctly, because the error only affects the accuracy testing.
How can I train words other than the speech_commands words?
-- Yes, you can put your own recordings into the speech_commands dataset, using the same folder structure as the yes, no, up, ... folders.
Does the procedure remain the same?
-- Yes, but you should add the --wanted_words parameter to list your own words (for example newWord1,newWord2,yes,no,... etc.). In my experience it is best to still keep 10 words in the list, so you do not need to change any code in Deployment.
Did I miss any step in getting weights.h?
-- No, I think that is enough: 1. train.py 2. fold_batchnorm.py 3. quant_test.py, and you will see weights.h.
@yanyanem
I got weights.h. For a simple KWS test, I made this change in main.cpp:
char output_classes[3][8] = {"silence", "unknown", "stop"};
For a 'stop' audio buffer, I get a detection of silence (99%). In real time I also get the same detection (silence, max_ind = 0).
Where did I go wrong?
@vrushalibhokare
@yanyanem.
I am trying to generate weights.h for the DNN model. I successfully generated the .pb file, but for the first layer (fc1) I got an error at freezing time:
Assign requires shapes of both tensors to match. lhs shape= [3920,144] rhs shape= [490,144]
[[node save/Assign_1 (defined at H:\ML-KWS-for-MCU-master\models.py:159) = Assign[T=DT_FLOAT, _class=["loc:@fc1/W"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](fc1/W, save/RestoreV2:1)]]
How can I reduce the length of the fc1 layer in the DNN model?
@vrushalibhokare
What is your training command to generate the .pb with the DNN model? You can use these parameters for the input dimensions of the DNN model:
--dct_coefficient_count 10 --window_size_ms 40 --window_stride_ms 40
Answer to @vrushalibhokare's question: if anyone is stuck with the same error, it can be caused by using special parameters in the training script (for example --dct_coefficient_count 20 --window_size_ms 20) and not using the same parameters in the other scripts. Using the same sizes in all scripts should lead to correct behaviour.
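This also explains the fc1 shape mismatch above: the flattened input feature length is spectrogram_length * dct_coefficient_count, and both factors depend on the window parameters, so different settings between scripts change the fc1 weight shape and the checkpoint can no longer be restored. A sketch of the arithmetic (assuming the 1000 ms clip length of the Speech Commands defaults; the helper name is mine):

```python
def fingerprint_size(clip_ms=1000, window_size_ms=40,
                     window_stride_ms=20, dct_coefficient_count=10):
    """Length of the flattened MFCC feature vector fed to the first layer."""
    spectrogram_length = 1 + (clip_ms - window_size_ms) // window_stride_ms
    return spectrogram_length * dct_coefficient_count

# A 20 ms stride gives 49 frames * 10 coefficients = 490, matching the
# rhs shape [490,144] in the freeze error above; a 40 ms stride gives a
# different size, so mismatched settings cannot restore the checkpoint.
print(fingerprint_size(window_stride_ms=20))  # 490
print(fingerprint_size(window_stride_ms=40))  # 250
```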
Hi, Could you share the commands/python calls used to generate the ds_cnn_weights.h file in \Deployment\Source\NN\DS_CNN? :)
I am assuming one can use one of the pretrained models (DS_CNN_L.pb, M, or S) in \Pretrained_models\DS_CNN to get the values of this header file?
I was looking at this page: https://github.com/ARM-software/ML-KWS-for-MCU/blob/master/Deployment/Quant_guide.md, but since the description is about DNN, I think I'm not understanding the ACT_MAX settings when calling: python quant_test.py --model_architecture ds_cnn --model_size_info 5 64 10 4 2 2 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 --dct_coefficient_count 10 --window_size_ms 40 --window_stride_ms 20 --checkpoint work\DS_CNN\DS_CNN1\training\best\ds_cnn_8479.ckpt-2400 --act_max ???? (what goes here?)
From this issue: https://github.com/ARM-software/ML-KWS-for-MCU/issues/53, it looks like one would need 12 act_max values. Can you provide the values used to generate the values in ds_cnn_weights.h?
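For what it's worth, act_max appears to play the same role for activations that the min/max search plays for weights: it fixes the fixed-point format of each layer's output (a 0 seems to mean "determine it from the data"). A sketch of how a given power-of-two act_max would map to a Q-format (my own illustration, not the repo's exact code):

```python
import numpy as np

def quantize_activations(x, act_max, total_bits=8):
    """Quantize activations to signed fixed point, given the expected
    maximum magnitude act_max (a power of two) for this layer."""
    int_bits = int(np.ceil(np.log2(act_max)))
    dec_bits = (total_bits - 1) - int_bits
    scale = 2.0 ** dec_bits  # negative dec_bits means dividing instead
    q = np.round(np.asarray(x, dtype=float) * scale)
    lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    return np.clip(q, lo, hi).astype(np.int8), dec_bits

# act_max = 32 (the value used for the input layer in the commands in this
# thread) leaves 7 - 5 = 2 fractional bits: x is stored as round(x * 4).
vals, dec = quantize_activations([1.5, -3.25], act_max=32)
# vals == [6, -13], dec == 2
```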
I also see fold_batchnorm.py, but I'm not sure whether it is required to get these values?
Thanks!