julianser / hed-dlg-truncated

Hierarchical Encoder Decoder RNN (HRED) with Truncated Backpropagation Through Time (Truncated BPTT)
GNU General Public License v3.0

VHRED encoder and context RNN initialization from converged HRED #7

Closed by nhooram 8 years ago

nhooram commented 8 years ago

Hello, thanks a lot for your work.

I am trying to train the VHRED model on the Twitter data. In commit f7c93464251973201df01be08cd77305512f5f03 you added the Twitter VHRED prototype in state.py. The comment above the Twitter VHRED prototype says that the model was

pretrained as the HRED model with state 'prototype_twitter_HRED'.

I think this refers to this snippet from the VHRED paper:

the VHRED’s encoder and context RNNs are initialized to the parameters of the corresponding converged HRED models.

I am stuck figuring out how to initialize the VHRED model from a converged HRED model using train.py.

Here is what I tried:

  1. First, I train with the prototype prototype_twitter_HRED.
  2. After convergence, I run:
python train.py --prototype prototype_twitter_VHRED --resume Output/<run_id>_TwitterModel &> Model_Output_twitter_VHRED.txt

But it seems that training just continues from the saved model with the same HRED prototype. I would appreciate any input.

Regards.

julianser commented 8 years ago

Hi Hooram,

Thanks for your interest! In the future, please only create issues about specific problems in the code. If you have general questions about HRED or VHRED, you can email me directly.

Once you have trained the HRED, you can resume from its saved model with the flags --reinitialize-latent-variable-parameters and --reinitialize-decoder-parameters to continue training as the VHRED model. For example:

python train.py --prototype prototype_twitter_VHRED --reinitialize-latent-variable-parameters --reinitialize-decoder-parameters --resume Output/<run_id>_TwitterModel &> Model_Output_twitter_VHRED.txt
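To make the two-stage workflow explicit, here is a minimal Python sketch that only assembles the two command lines. The helper names `hred_command` and `vhred_command` are hypothetical and not part of this repository. The stage 1 invocation (`--prototype prototype_twitter_HRED`) is an assumption based on the prototype name mentioned above. The flags in stage 2 are exactly the ones from the command above, and `Output/<run_id>_TwitterModel` is left as a placeholder for your own run.

```python
def hred_command(prototype="prototype_twitter_HRED"):
    """Stage 1 (assumed invocation): train the HRED model until convergence."""
    return ["python", "train.py", "--prototype", prototype]


def vhred_command(hred_checkpoint, prototype="prototype_twitter_VHRED"):
    """Stage 2: resume from the converged HRED run as a VHRED model.

    The two reinitialize flags are taken from the reply above; everything
    else is loaded from the saved HRED model passed via --resume.
    """
    return [
        "python", "train.py",
        "--prototype", prototype,
        "--reinitialize-latent-variable-parameters",
        "--reinitialize-decoder-parameters",
        "--resume", hred_checkpoint,
    ]


if __name__ == "__main__":
    # Placeholder path: substitute the prefix of your own converged HRED run.
    checkpoint = "Output/<run_id>_TwitterModel"
    for cmd in (hred_command(), vhred_command(checkpoint)):
        print(" ".join(cmd))
```

If you prefer to launch the stages from Python rather than copy the printed commands into a shell, each list can be passed to `subprocess.run(cmd, check=True)` once the placeholder has been replaced with a real path.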

Cheers,

Iulian Vlad Serban