Closed sam-iitj closed 7 years ago
Have you ever done that?
We have not investigated this. In principle, it could work! For supervised fine-tuning, you would add a classification layer after slicing out the final timestep's features (or averaging the features across timesteps), then use TensorFlow's `tf.nn.sparse_softmax_cross_entropy_with_logits` as the loss and optimize it with `tf.train.AdamOptimizer`.
For generative fine-tuning, `encoder.py` exposes the model's predictions as `logits`, which could be used with the same loss/training setup and suitable targets (a 1-timestep-shifted version of the inputs). To do truncated backpropagation properly, you would need to sample suitably aligned training segments and persist the hidden states of the model between training steps. `encoder.py` demonstrates an admittedly complicated version of this in the `transform` function, since it only computes features for 64-character subsequences at a time.
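The segment-alignment idea can be sketched in pure Python (this is an illustration, not the repo's code; `make_segments` and `StatefulStepper` are hypothetical names, and the toy state update stands in for running the mLSTM):

```python
import numpy as np

SEG_LEN = 64  # encoder.py processes 64-character subsequences

def make_segments(text, seg_len=SEG_LEN):
    """Yield (inputs, targets) pairs where targets are the inputs
    shifted by one timestep, chunked into aligned segments."""
    data = np.frombuffer(text.encode("utf-8"), dtype=np.uint8)
    inputs, targets = data[:-1], data[1:]          # 1-timestep shift
    n = len(inputs) // seg_len
    for i in range(n):
        sl = slice(i * seg_len, (i + 1) * seg_len)
        yield inputs[sl], targets[sl]

class StatefulStepper:
    """Stand-in for an RNN whose hidden state must persist across
    segments so truncated BPTT sees a continuous sequence."""
    def __init__(self, dim=8):
        self.h = np.zeros(dim)

    def step_segment(self, seg):
        # toy dynamics; a real model would run the mLSTM here and
        # backpropagate only within this segment
        for byte in seg:
            self.h = np.tanh(0.9 * self.h + 0.01 * byte)
        return self.h.copy()   # state carried into the next segment

text = "hello world " * 50
stepper = StatefulStepper()
states = [stepper.step_segment(x) for x, y in make_segments(text)]
```

The key points are that each segment's targets are the next characters of that same segment, and that the hidden state flowing into segment `i+1` is the state left behind by segment `i`, not a fresh zero state.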
Hi,
I wanted to know if there is a way we can fine-tune this model on other datasets.