NVIDIA / OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
https://nvidia.github.io/OpenSeq2Seq
Apache License 2.0

Use a pre-trained Librispeech Jasper model on another dataset #479

Closed ngochuyenluu closed 4 years ago

ngochuyenluu commented 4 years ago

Hello everyone, I intend to train the Jasper model on a dataset about tax and finance of around 1.5 GB. I would like to start from the pre-trained Librispeech checkpoint and continue training, with an n-gram language model and beam search, to save time and effort. As advised in #470, all of the data was preprocessed into clips of around 10-24 s with their transcripts, the learning rate was decreased, the number of epochs was increased, and so on; I tried every hyperparameter adjustment I could think of to improve the model's performance. I also replaced the dataset files in train_params, eval_params, and infer_params with my training, dev, and test files.
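For illustration, a minimal sketch of that kind of config edit, following the layout of the OpenSeq2Seq Jasper example configs (the finance CSV paths and the max_duration value here are placeholders, not the exact settings from this post):

```python
from open_seq2seq.data import Speech2TextDataLayer

# Each section of the config points at its own CSV file; OpenSeq2Seq
# CSVs have wav_filename, wav_filesize, transcript columns.
train_params = {
    "data_layer": Speech2TextDataLayer,
    "data_layer_params": {
        "dataset_files": ["data/finance/train.csv"],  # placeholder path
        "max_duration": 24.0,  # clips were preprocessed to ~10-24 s
        "shuffle": True,
        # other data_layer_params (vocab_file, input_type, ...) stay as
        # in the original Jasper config
    },
}
eval_params = {
    "data_layer": Speech2TextDataLayer,
    "data_layer_params": {
        "dataset_files": ["data/finance/dev.csv"],  # placeholder path
        "shuffle": False,
    },
}
infer_params = {
    "data_layer": Speech2TextDataLayer,
    "data_layer_params": {
        "dataset_files": ["data/finance/test.csv"],  # placeholder path
        "shuffle": False,
    },
}
```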

In the end, the model trained well, but validation was very poor. When I evaluated the pre-trained checkpoint alone beforehand, it gave 0.34 on my validation set, which is better than the result after training on my dataset (around 0.97).

I think my training overwrote the pre-trained checkpoint, so I am now training on my dataset plus Librispeech and validating on my dev file plus Librispeech's dev files. Training resumed from the 3rd epoch, and each epoch takes a long time (on 2 Titan V GPUs); I am still waiting to see whether it improves.
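One way to keep the downloaded checkpoint untouched is to fine-tune into a fresh logdir instead of training in place. A minimal sketch, assuming the load_model transfer-learning option that OpenSeq2Seq accepts in base_params (both paths are placeholders):

```python
base_params = {
    # ... all other Jasper base_params unchanged ...
    "logdir": "experiments/jasper_finance",          # fresh run directory (placeholder)
    "load_model": "checkpoints/jasper_librispeech",  # dir holding the pre-trained checkpoint
}
```

With this setup the pre-trained weights are only read once at startup, so the original checkpoint files are never overwritten.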

Did I make a mistake in how I used the pre-trained model to continue training? Could my training have changed the pre-trained checkpoints? Any suggestions would be welcome. Thank you in advance.

ngochuyenluu commented 4 years ago

Hi, just some information for anyone working on the same approach as me. To continue from the pre-trained model, you should add your training files to the existing Librispeech dataset in train_params, not replace the Librispeech data entirely. Otherwise the model keeps adapting to the new dataset and drifts away from the initial checkpoint, and it overfits, giving poor results on the eval set, because the new training data alone is not sufficient for it.
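In config terms, a minimal sketch of that advice (the finance CSV path is a placeholder; the Librispeech CSV names follow the repo's example configs):

```python
from open_seq2seq.data import Speech2TextDataLayer

train_params = {
    "data_layer": Speech2TextDataLayer,
    "data_layer_params": {
        "dataset_files": [
            # keep the original Librispeech training CSVs ...
            "data/librispeech/librivox-train-clean-100.csv",
            "data/librispeech/librivox-train-clean-360.csv",
            "data/librispeech/librivox-train-other-500.csv",
            # ... and append the new-domain data instead of replacing them
            "data/finance/train.csv",  # placeholder
        ],
        "shuffle": True,
        # other data_layer_params stay as in the original Jasper config
    },
}
```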

Adding the language model and beam search step would require retraining from the beginning, which takes a very long time. So if NVIDIA could publish pre-trained models for the two other variants, the ones with a 6-gram language model and beam search, that would be super great!