Closed mr-coconut closed 2 years ago
First,
INFO:tensorflow: name = transducer/transducer_prediction/transducer_prediction_embedding/embeddings:0, shape = (1019, 640), *INIT_FROM_CKPT*
INFO:tensorflow: name = transducer/transducer_prediction/transducer_prediction_lstm_0/lstm_cell/kernel:0, shape = (640, 2560)
INFO:tensorflow: name = transducer/transducer_prediction/transducer_prediction_lstm_0/lstm_cell/recurrent_kernel:0, shape = (640, 2560)
INFO:tensorflow: name = transducer/transducer_prediction/transducer_prediction_lstm_0/lstm_cell/bias:0, shape = (2560,)
INFO:tensorflow: name = transducer/transducer_prediction/transducer_prediction_ln_0/gamma:0, shape = (640,), *INIT_FROM_CKPT*
INFO:tensorflow: name = transducer/transducer_prediction/transducer_prediction_ln_0/beta:0, shape = (640,), *INIT_FROM_CKPT*
This only loads the language-model parameters; to load all parameters, simply remove the 2nd line:
variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
init_checkpoint = 'output-large-singlish-conformer_copy/model.ckpt'
I validated the checkpoint using colab, https://colab.research.google.com/drive/1IP_GtVUAIJVDTv5C60RM9tKjdzHSO3zI?usp=sharing
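For reference, the variable matching that produces the `*INIT_FROM_CKPT*` markers in the log above can be sketched roughly as follows. This is a simplified, hypothetical helper (not the actual malaya-speech code): graph variables whose names, minus the `:0` suffix, exist in the checkpoint get restored; everything else is trained from scratch. The `scope_filter` argument is an illustration of how restoring could be limited to the prediction (LM) network only.

```python
def build_assignment_map(graph_var_names, ckpt_var_names, scope_filter=None):
    """Map checkpoint variable names to graph variable names.

    graph_var_names: graph names like 'transducer/.../embeddings:0'
    ckpt_var_names:  names stored in the checkpoint (no ':0' suffix)
    scope_filter:    optional prefix, e.g. 'transducer/transducer_prediction',
                     to restore only the prediction (LM) network
    """
    ckpt_set = set(ckpt_var_names)
    assignment_map, initialized = {}, set()
    for name in graph_var_names:
        stripped = name.split(':')[0]  # drop the ':0' output-index suffix
        if scope_filter and not stripped.startswith(scope_filter):
            continue  # variable outside the requested scope: left untouched
        if stripped in ckpt_set:
            assignment_map[stripped] = stripped
            initialized.add(name)  # these are the ones logged *INIT_FROM_CKPT*
    return assignment_map, initialized
```

In TF1-style code the resulting `assignment_map` would then be passed to `tf.train.init_from_checkpoint(init_checkpoint, assignment_map)`; variables missing from the map (such as the LSTM kernels without the marker in the log above) keep their random initialization.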
Thanks for the reply! I will look through this.
Hi, thanks for your reply again. But I'm confused by the "This only loads language model parameters" part. I copied these lines from https://github.com/huseinzol05/malaya-speech/blob/master/pretrained-model/stt/conformer/base.py#L340 to try to load from a previous checkpoint. I think it should make sense to retrain only the language-model parameters on new data while keeping the other parameters unchanged. I am still trying to work out why the loss is around 800, which shouldn't be the case since I used your pretrained checkpoint. One possible explanation is that the checkpoint wasn't loaded successfully. Thanks!
You might want to compare the accuracy of finetuning the entire model vs finetuning the LM only. Also, did you compile warp-transducer? You need to verify the warp-transducer result first using the provided unit tests, https://github.com/huseinzol05/malaya-speech/blob/master/scripts/build-rnnt-3090.sh
Hi, thanks for the reply to the last issue. Now I can get the training script to run, but the loss seems to be very high. I will attach the code and results below:
The training result is shown as below:
Since the loss is very high, I also tried loading from output-large-singlish-conformer/model.ckpt, and the result looks like this:
Sorry for such a long issue. I'd appreciate it if you could take a look. Thanks!