Transfer learning - Githubissues

flassTer commented 5 years ago

Is it possible to start transfer learning with more characters in the vocabulary file than the ASR model was already trained on?

blisc commented 5 years ago

We have never tried this. I can only offer some starting points that may or may not work.

You can try loading the entire encoder weights, creating a new decoder from scratch, and training from there.
You can load the entire model, add another layer to the decoder, and start training from there.
You can load the final 1024 -> 29 labels layer and append a [1024, num_new_chars] block to its weight tensor (I am not sure how you can do this in TF, though).
You can experiment with any of the above while also freezing the encoder weights and/or applying a learning rate warmup from 0.
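
The third suggestion, widening the final layer, can be sketched without any framework. This is only a minimal illustration in plain Python, with nested lists standing in for tensors; `extend_output_layer`, `init_scale`, and `seed` are hypothetical names, not part of OpenSeq2Seq or TensorFlow, and the bias handling assumes the layer has a bias vector at all. The old columns are copied unchanged and the new ones get small random values so the existing predictions are disturbed as little as possible at the start of training.

```python
import random


def extend_output_layer(weights, bias, num_new_chars, init_scale=0.01, seed=0):
    """Return widened copies of (weights, bias) with num_new_chars extra outputs.

    weights: list of rows, shape [hidden_dim][num_old_labels]
    bias:    list of length num_old_labels (assumed to exist)
    """
    rng = random.Random(seed)
    # Keep every pretrained column and append small random columns per row.
    new_weights = [
        row + [rng.uniform(-init_scale, init_scale) for _ in range(num_new_chars)]
        for row in weights
    ]
    # New labels start with a zero bias.
    new_bias = bias + [0.0] * num_new_chars
    return new_weights, new_bias


# Stand-in values for a pretrained 1024 -> 29 layer (not real model weights).
hidden_dim, num_old_labels, num_new_chars = 1024, 29, 3
old_w = [[0.5] * num_old_labels for _ in range(hidden_dim)]
old_b = [0.1] * num_old_labels

new_w, new_b = extend_output_layer(old_w, old_b, num_new_chars)
```

One thing to check before trying this on a real checkpoint: if the model reserves a specific label index (for example a CTC blank), appending columns at the end may shift its position relative to the new vocabulary, so the new columns might need to be inserted elsewhere.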

Depending on what you are trying to accomplish, it might simply be easier to either hard-code rules or train a neural LM that converts the model predictions into the desired transcripts, e.g., words -> numbers, dollar -> $, etc.
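
The hard-coded variant could look roughly like the sketch below. The `REWRITES` table and the `postprocess` function are hypothetical and only cover a handful of tokens; a real mapping would depend on the transcripts you want.

```python
import re

# Hypothetical rewrite table; extend it to match the target transcripts.
REWRITES = {
    "dollar": "$",
    "percent": "%",
    "one": "1",
    "two": "2",
    "three": "3",
}

# Match whole words only, so "someone" is not rewritten to "some1".
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, REWRITES)) + r")\b")


def postprocess(transcript):
    """Replace each known word in the model output with its symbol."""
    return _PATTERN.sub(lambda m: REWRITES[m.group(1)], transcript)


print(postprocess("it costs three dollar"))
```

This keeps the acoustic model and its vocabulary untouched, at the cost of maintaining the rules by hand.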

flassTer commented 5 years ago

Thank you @blisc

NVIDIA / OpenSeq2Seq

Transfer learning #467