Closed: chenting0324 closed this issue 7 years ago
Hi, the error rate can in fact be higher on another dataset. The pre-trained model was trained on 3 different datasets simultaneously, so it should be able to process various data, but performance is always better on test data corresponding to the sets used for training. About the hidden vectors: you can in fact get them directly from the TensorFlow checkpoint file. Try using https://gist.github.com/batzner/7c24802dd9c5e15870b4b56e22135c96 against the checkpoint file; it is useful for listing the variables in a checkpoint and renaming some of them so that their names match the naming in your own model. I'm interested in your WER and CER results on the other dataset if you still have them. I'm trying to obtain a better result by fine-tuning hyperparameters but haven't gotten any significant improvement for the moment.
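For reference, WER and CER are both a length-normalized Levenshtein distance, computed over words and characters respectively. A minimal plain-Python sketch (not from this repository; the function names are my own):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, single-row DP."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev = dp[0]          # dp[i-1][j-1] from the previous row
        dp[0] = i
        for j, h in enumerate(hyp, 1):
            cur = dp[j]
            # min of deletion, insertion, substitution (cost 0 on match)
            dp[j] = min(dp[j] + 1, dp[j - 1] + 1, prev + (r != h))
            prev = cur
    return dp[-1]

def wer(ref, hyp):
    """Word error rate: word-level edits / reference word count."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)

def cer(ref, hyp):
    """Character error rate: character-level edits / reference length."""
    return edit_distance(ref, hyp) / len(ref)
```

For example, `wer("the cat sat", "the bat sat")` gives 1/3 (one substitution over three reference words).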
Hi, I will use the model on another dataset in a few days and will give you the WER and CER results then. Thank you for your answer; I will give it a try. By "the TensorFlow checkpoint file", do you mean the file in your trained_models/acoustic_model/english/checkpoint, or some other file? Thank you very much!!!
Hello, I used https://github.com/tensorflow/tensorflow/blob/r0.10/tensorflow/python/tools/inspect_checkpoint.py and got the following result:
tensor_name: rnn/multi_rnn_cell/cell_2/basic_lstm_cell/biases
[-0.0313677 -0.00658913 -0.01896443 ..., -0.00783602 0.00158059 0.00172835]
tensor_name: rnn/multi_rnn_cell/cell_1/basic_lstm_cell/biases
[ 0.0001673 -0.05007814 0.00368076 ..., -0.01218077 0.01428195 0.02477961]
tensor_name: rnn/multi_rnn_cell/cell_0/basic_lstm_cell/biases
[-0.00722365 -0.04710276 0.02127041 ..., -0.01892216 -0.06573219 -0.01893883]
tensor_name: global_step
9500
tensor_name: Input_Layer/input_b
[-0.11696832 0.00067524 -0.01014187 ..., -0.09762439 0.04245738 0.01483767]
tensor_name: learning_rate
3.267e-05
tensor_name: rnn/multi_rnn_cell/cell_0/basic_lstm_cell/weights
[[-0.01136321 -0.00384914 -0.04242354 ..., -0.01532782 0.06549271 0.00346065]
 [-0.05127605 -0.02392749 -0.01478572 ..., -0.01351318 0.04717604 0.05263193]
 [ 0.0404567 -0.01764796 0.0138403 ..., -0.01546826 0.01645955 0.00134034]
 ...,
 [-0.04345571 -0.05136441 -0.03100013 ..., 0.0047017 -0.0306399 0.03517035]
 [-0.00018307 0.0137899 0.00612506 ..., 0.03813621 -0.05160636 -0.0202991 ]
 [-0.00455729 0.03632646 -0.01809452 ..., -0.00676679 0.00376569 0.10562224]]
tensor_name: Input_Layer/input_w
[[-0.03182084 -0.01579602 0.00075297 ..., 0.02241369 -0.00096865 0.01190329]
 [-0.0249869 0.01131274 -0.0337288 ..., 0.00583539 0.00882952 -0.03973912]
 [-0.02620839 -0.0410029 0.00613309 ..., 0.00256835 -0.02844046 0.02862929]
 ...,
 [-0.03583897 0.00582753 -0.07601061 ..., -0.03153202 -0.00913378 0.05261716]
 [-0.02297934 0.04844452 -0.02955636 ..., 0.02908967 0.00248094 0.01559274]
 [ 0.01304239 0.04316467 -0.07282739 ..., 0.01061043 0.0257307 0.11344201]]
tensor_name: Output_layer/output_w
[[ 0.05101303 0.18263206 0.18622988 ..., -0.03892048 -0.06715754 -0.06261685]
 [ 0.00886919 0.04810521 -0.17560247 ..., -0.21467008 0.05993012 0.0771218 ]
 [ 0.03629687 -0.13104998 -0.08590779 ..., -0.08579271 0.05087085 0.05212956]
 ...,
 [-0.06602976 0.13366954 0.03888952 ..., -0.06018423 -0.02374146 0.00839281]
 [ 0.09960367 0.17472173 0.08439461 ..., 0.15567915 0.0323404 -0.03471686]
 [ 0.1105698 0.18649958 -0.00129581 ..., 0.16154449 0.03485731 -0.17772995]]
tensor_name: rnn/multi_rnn_cell/cell_2/basic_lstm_cell/weights
[[-0.0356952 0.01836566 -0.00732613 ..., 0.046164 -0.06747929 0.05385 ]
 [ 0.08403415 0.03950982 0.00801035 ..., 0.02327672 0.01805933 -0.0331181 ]
 [ 0.0084149 -0.02517631 0.00857453 ..., -0.05464114 0.0043622 0.03270212]
 ...,
 [-0.0187362 -0.07235921 0.06286826 ..., -0.01012454 0.02534539 0.02963923]
 [ 0.07454605 0.031953 -0.04824256 ..., 0.02892545 -0.01999683 0.01131981]
 [ 0.01235672 0.02575596 -0.03723545 ..., -0.0870229 -0.04768194 -0.15054134]]
tensor_name: rnn/multi_rnn_cell/cell_1/basic_lstm_cell/weights
[[ 0.05117989 -0.05484957 0.00072761 ..., 0.07506247 -0.0041365 0.00778818]
 [ 0.08750629 -0.01551697 0.00817819 ..., 0.04885173 -0.03843196 -0.04395888]
 [-0.0075606 0.01272487 0.04914607 ..., 0.04965738 0.01223274 0.01021633]
 ...,
 [ 0.00674141 0.00118298 0.03366723 ..., -0.02298987 -0.0515626 0.03328852]
 [-0.07422537 -0.04096507 -0.00999226 ..., 0.02797294 -0.02184403 -0.05488605]
 [ 0.03184707 -0.06398494 -0.05414595 ..., -0.04157395 0.04862571 0.00409058]]
But now I'm confused: which of these are the hidden vectors? Can you help me? Thanks a lot!!!
Hi,
The checkpoint file and the 3 acousticmodel.ckpt.* files are the 4 files used by TensorFlow to save a checkpoint.
In the list you obtained you have multiple tensors. If you want to use the network, you should use every variable except the learning_rate and the global_step counter. You can also use only a part of those variables as a starting point for a larger network. I don't know if it would accelerate the training... maybe...
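The filtering described above can be sketched with plain name matching. This is a hypothetical illustration (not code from the repository); the variable names are taken from the checkpoint listing in this thread, with the three-cell LSTM abbreviated to one cell for brevity:

```python
# Variable names as reported by inspect_checkpoint earlier in this thread
# (only cell_0 of the LSTM stack is shown here).
checkpoint_vars = [
    "Input_Layer/input_w", "Input_Layer/input_b",
    "Output_layer/output_w",
    "rnn/multi_rnn_cell/cell_0/basic_lstm_cell/weights",
    "rnn/multi_rnn_cell/cell_0/basic_lstm_cell/biases",
    "global_step", "learning_rate",
]

# Keep every network variable; drop the training bookkeeping tensors.
SKIP = {"global_step", "learning_rate"}
restore_vars = [name for name in checkpoint_vars if name not in SKIP]
```

After the filter, `restore_vars` contains only the weights and biases that define the network itself, which is the set you would map into a new (possibly larger) model.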
Note that output_b is missing from the checkpoint because of a bug in the saving method. This is fixed in the dev branch, but because of it the current pre-trained network won't load with the software in that branch.
Hi, thank you for your detailed answer. In fact, I only want to extract the hidden vectors from the RNN, but my list doesn't include them, right? The code self.saver = tf.train.Saver(save_list) saves all the listed variables, but the hidden vectors weren't saved, right? I want to use the hidden vectors elsewhere, so can I get them from the TensorFlow checkpoint file? I didn't see them in the checkpoint file. (So sad!!!) Thank you!!!!
Maybe the hidden vectors are defined in this code: self.hidden_state = tf.Variable(tf.zeros((num_layers, 2, batch_size, hidden_size)), trainable=False)?
OK, I didn't understand it before, sorry. The hidden vector is in fact stored in the self.hidden_state variable. It's not saved into the checkpoint, but you could have it with a minor change: just add its name to the definition of save_list in AcousticModel.py:
save_list = [var for var in tf.global_variables()
             if (var.name.find('/input_w:0') != -1) or
                (var.name.find('/input_b:0') != -1) or
                (var.name.find('/output_w:0') != -1) or
                (var.name.find('/output_b:0') != -1) or
                (var.name.find('global_step:0') != -1) or
                (var.name.find('learning_rate:0') != -1) or
                (var.name.find('/weights:0') != -1) or
                (var.name.find('/biases:0') != -1)]
You can get the name by looking into the tensorboard's graph.
Thank you very much! I will give it a try!!!
Hi, I added self.hidden_state to the save_list as follows:

save_list = [var for var in tf.global_variables()
             if (var.name.find('/input_w:0') != -1) or
                (var.name.find('/input_b:0') != -1) or
                (var.name.find('/output_w:0') != -1) or
                (var.name.find('/output_b:0') != -1) or
                (var.name.find('global_step:0') != -1) or
                (var.name.find('learning_rate:0') != -1) or
                (var.name.find('/weights:0') != -1) or
                (var.name.find('/biases:0') != -1) or
                (var.name.find('/self.hidden_state:0') != -1)]

but it didn't work; the result is the same as before:

All Variables:
Input_Layer/input_b (DT_FLOAT) [1024]
Input_Layer/input_w (DT_FLOAT) [120,1024]
Output_layer/output_w (DT_FLOAT) [1024,80]
global_step (DT_INT32) []
learning_rate (DT_FLOAT) []
rnn/multi_rnn_cell/cell_0/basic_lstm_cell/biases (DT_FLOAT) [4096]
rnn/multi_rnn_cell/cell_0/basic_lstm_cell/weights (DT_FLOAT) [2048,4096]
rnn/multi_rnn_cell/cell_1/basic_lstm_cell/biases (DT_FLOAT) [4096]
rnn/multi_rnn_cell/cell_1/basic_lstm_cell/weights (DT_FLOAT) [2048,4096]
rnn/multi_rnn_cell/cell_2/basic_lstm_cell/biases (DT_FLOAT) [4096]
rnn/multi_rnn_cell/cell_2/basic_lstm_cell/weights (DT_FLOAT) [2048,4096]

It doesn't contain the variable "self.hidden_state"; it's not saved into the checkpoint!! How can I solve this problem? Thank you very much!!!
Hi,
The name of the variable you have to add to the save_list is the TensorFlow name. For this variable it's "Hidden_state/hidden_state:0", so you should put:
(var.name.find('/hidden_state:0') != -1)
and it should work.
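The fix works because save_list matches by substring on the TensorFlow variable name, not on the Python attribute name. A quick plain-Python check (the names mirror the ones discussed in this thread):

```python
# The actual TensorFlow name of the variable, as shown in the graph.
tf_name = "Hidden_state/hidden_state:0"

# The pattern suggested above matches the TensorFlow name...
assert tf_name.find("/hidden_state:0") != -1

# ...while the Python attribute name tried earlier never matches:
assert tf_name.find("/self.hidden_state:0") == -1
```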
If you want, you can also get rid of the if statement in the list construction; that way all of TensorFlow's variables will be saved in the checkpoint file, including the hidden_state.
Hi,
I added it to the save_list as follows:
save_list = [var for var in tf.global_variables()
             if (var.name.find('/input_w:0') != -1) or
                (var.name.find('/input_b:0') != -1) or
                (var.name.find('/output_w:0') != -1) or
                (var.name.find('/output_b:0') != -1) or
                (var.name.find('global_step:0') != -1) or
                (var.name.find('learning_rate:0') != -1) or
                (var.name.find('/weights:0') != -1) or
                (var.name.find('/biases:0') != -1) or
                (var.name.find('/hidden_state:0') != -1)]
and the code I use to read the variables from the checkpoint file is as follows:
from tensorflow.python import pywrap_tensorflow
import os

checkpoint_dir = "trained_models/acoustic_model/english"
checkpoint_path = os.path.join(checkpoint_dir, "acousticmodel.ckpt")
reader = pywrap_tensorflow.NewCheckpointReader(checkpoint_path)
var_to_shape_map = reader.get_variable_to_shape_map()
for key in var_to_shape_map:
    print("tensor_name: ", key)
    print(reader.get_tensor(key))
and the result is:

tensor_name: rnn/multi_rnn_cell/cell_1/basic_lstm_cell/biases
[ 0.0001673 -0.05007814 0.00368076 ..., -0.01218077 0.01428195 0.02477961]
tensor_name: rnn/multi_rnn_cell/cell_0/basic_lstm_cell/weights
[[-0.01136321 -0.00384914 -0.04242354 ..., -0.01532782 0.06549271 0.00346065]
 [-0.05127605 -0.02392749 -0.01478572 ..., -0.01351318 0.04717604 0.05263193]
 [ 0.0404567 -0.01764796 0.0138403 ..., -0.01546826 0.01645955 0.00134034]
 ...,
 [-0.04345571 -0.05136441 -0.03100013 ..., 0.0047017 -0.0306399 0.03517035]
 [-0.00018307 0.0137899 0.00612506 ..., 0.03813621 -0.05160636 -0.0202991 ]
 [-0.00455729 0.03632646 -0.01809452 ..., -0.00676679 0.00376569 0.10562224]]
tensor_name: Input_Layer/input_w
[[-0.03182084 -0.01579602 0.00075297 ..., 0.02241369 -0.00096865 0.01190329]
 [-0.0249869 0.01131274 -0.0337288 ..., 0.00583539 0.00882952 -0.03973912]
 [-0.02620839 -0.0410029 0.00613309 ..., 0.00256835 -0.02844046 0.02862929]
 ...,
 [-0.03583897 0.00582753 -0.07601061 ..., -0.03153202 -0.00913378 0.05261716]
 [-0.02297934 0.04844452 -0.02955636 ..., 0.02908967 0.00248094 0.01559274]
 [ 0.01304239 0.04316467 -0.07282739 ..., 0.01061043 0.0257307 0.11344201]]
tensor_name: global_step
9500
tensor_name: rnn/multi_rnn_cell/cell_1/basic_lstm_cell/weights
[[ 0.05117989 -0.05484957 0.00072761 ..., 0.07506247 -0.0041365 0.00778818]
 [ 0.08750629 -0.01551697 0.00817819 ..., 0.04885173 -0.03843196 -0.04395888]
 [-0.0075606 0.01272487 0.04914607 ..., 0.04965738 0.01223274 0.01021633]
 ...,
 [ 0.00674141 0.00118298 0.03366723 ..., -0.02298987 -0.0515626 0.03328852]
 [-0.07422537 -0.04096507 -0.00999226 ..., 0.02797294 -0.02184403 -0.05488605]
 [ 0.03184707 -0.06398494 -0.05414595 ..., -0.04157395 0.04862571 0.00409058]]
tensor_name: rnn/multi_rnn_cell/cell_2/basic_lstm_cell/weights
[[-0.0356952 0.01836566 -0.00732613 ..., 0.046164 -0.06747929 0.05385 ]
 [ 0.08403415 0.03950982 0.00801035 ..., 0.02327672 0.01805933 -0.0331181 ]
 [ 0.0084149 -0.02517631 0.00857453 ..., -0.05464114 0.0043622 0.03270212]
 ...,
 [-0.0187362 -0.07235921 0.06286826 ..., -0.01012454 0.02534539 0.02963923]
 [ 0.07454605 0.031953 -0.04824256 ..., 0.02892545 -0.01999683 0.01131981]
 [ 0.01235672 0.02575596 -0.03723545 ..., -0.0870229 -0.04768194 -0.15054134]]
tensor_name: rnn/multi_rnn_cell/cell_2/basic_lstm_cell/biases
[-0.0313677 -0.00658913 -0.01896443 ..., -0.00783602 0.00158059 0.00172835]
tensor_name: Output_layer/output_w
[[ 0.05101303 0.18263206 0.18622988 ..., -0.03892048 -0.06715754 -0.06261685]
 [ 0.00886919 0.04810521 -0.17560247 ..., -0.21467008 0.05993012 0.0771218 ]
 [ 0.03629687 -0.13104998 -0.08590779 ..., -0.08579271 0.05087085 0.05212956]
 ...,
 [-0.06602976 0.13366954 0.03888952 ..., -0.06018423 -0.02374146 0.00839281]
 [ 0.09960367 0.17472173 0.08439461 ..., 0.15567915 0.0323404 -0.03471686]
 [ 0.1105698 0.18649958 -0.00129581 ..., 0.16154449 0.03485731 -0.17772995]]
tensor_name: rnn/multi_rnn_cell/cell_0/basic_lstm_cell/biases
[-0.00722365 -0.04710276 0.02127041 ..., -0.01892216 -0.06573219 -0.01893883]
tensor_name: Input_Layer/input_b
[-0.11696832 0.00067524 -0.01014187 ..., -0.09762439 0.04245738 0.01483767]
tensor_name: learning_rate
3.267e-05
It still doesn't include the hidden_state. Is there anything else that needs to be modified? Thank you very much!!! (Sorry for bothering you.)
The hidden_state is not in the checkpoint file you are currently using. The modification to the save_list only allows it to be saved when a new checkpoint is created, so you now have to train the network to produce a checkpoint that contains the hidden_state. Are you sure the hidden_state is what you are looking for? It's not very useful on its own: it only keeps the state of the network from one slice of audio samples to the next, and at the end of training its content is directly inherited from the last files processed, which can be any files.
Hi, I want to use the hidden vectors to train word2vec. Its initial vectors are initialized randomly, and I want to initialize them with the hidden vectors instead, so I need to extract them from hidden_state. The hidden_state contains the hidden vectors, right? Or maybe there is some better way? I'm a little confused! Thank you very much!
You mean I have to train the network, i.e. I should use "python stt.py --train" instead of "python stt.py --file"? Thanks a lot!
Hi, yes, for training you need to launch "python stt.py --train", but you also need training data in the data directory and a corresponding config.ini file. About using the hidden_state for the word2vec network: beware that the AcousticModel uses a frame of size 0.025 s and advances only 0.01 s between steps, so consecutive frames overlap.
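With that frame geometry (0.025 s window, 0.01 s step), consecutive frames share 0.015 s of audio. A quick sketch of the arithmetic (hypothetical helper, not from the repository; values in milliseconds to keep the math exact):

```python
# Frame geometry of the AcousticModel as described above.
window_ms = 25   # frame length
step_ms = 10     # hop between consecutive frames
overlap_ms = window_ms - step_ms   # audio shared by adjacent frames: 15 ms

def num_frames(duration_ms):
    """Number of full windows that fit in a clip of the given length (ms)."""
    if duration_ms < window_ms:
        return 0
    return (duration_ms - window_ms) // step_ms + 1
```

For example, a 1-second clip gives num_frames(1000) == 98 overlapping frames, so the per-step hidden vectors are far from independent, which matters if you feed them to another model.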
That sounds a little difficult, but I will give it a try. Thank you!
Closing, please reopen if you have any other related questions.
Hello! I used the model on another dataset and its word error rate was a little high; is that a normal phenomenon? In addition, I want to know how I can get the hidden vectors; I want to use them elsewhere. Thank you very much!