VHRED encoder and context RNN initialization from converged HRED

julianser / hed-dlg-truncated

Hierarchical Encoder Decoder RNN (HRED) with Truncated Backpropagation Through Time (Truncated BPTT)

GNU General Public License v3.0

308 stars, 129 forks

Hello, thanks a lot for your work.

I am trying to train the VHRED model on the Twitter data. In commit f7c93464251973201df01be08cd77305512f5f03 you added the Twitter VHRED prototype to state.py. The comment above that prototype says that it was

"pretrained as the HRED model with state 'prototype_twitter_HRED'".

I think this refers to the following passage from the VHRED paper:

"the VHRED’s encoder and context RNNs are initialized to the parameters of the corresponding converged HRED models."

I am stuck figuring out how to initialize the VHRED model from a converged HRED model using train.py.

Here is what I tried:

First, I trained with the prototype prototype_twitter_HRED.
After convergence, I ran:

python train.py --prototype prototype_twitter_VHRED --resume Output/<run_id>_TwitterModel &> Model_Output_twitter_VHRED.txt

However, training seems to simply continue from the saved model with the same HRED prototype, as if the --prototype prototype_twitter_VHRED argument had no effect. I would appreciate any input.
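To make the goal concrete: the initialization described in the paper amounts to copying every parameter that the two models share (encoder and context RNNs) and leaving the VHRED-only parameters at their fresh initialization. Below is a minimal sketch of that selection step in plain Python. It is not code from this repository: the function plan_transfer, the parameter names, the shapes, and the "latent_" prefix are all hypothetical and only illustrate the idea.

```python
def plan_transfer(hred_shapes, vhred_shapes, skip_prefixes=()):
    """Decide which VHRED parameters could be initialized from an HRED checkpoint.

    hred_shapes / vhred_shapes: dict mapping parameter name -> shape tuple.
    skip_prefixes: iterable of name prefixes that must keep their fresh
    initialization (hypothetical naming scheme, adjust to the real model).
    Returns (to_copy, skipped), both sorted lists of VHRED parameter names.
    """
    prefixes = tuple(skip_prefixes)
    to_copy, skipped = [], []
    for name, shape in sorted(vhred_shapes.items()):
        if prefixes and name.startswith(prefixes):
            # Explicitly excluded, e.g. latent variable parameters.
            skipped.append(name)
        elif hred_shapes.get(name) == shape:
            # Same name and same shape in both models: safe to copy.
            to_copy.append(name)
        else:
            # Missing in HRED, or the shape differs (for example a decoder
            # whose input also includes the latent variable).
            skipped.append(name)
    return to_copy, skipped


# Hypothetical parameter inventories, for illustration only.
hred_shapes = {
    "utterance_encoder_W": (4, 8),
    "context_encoder_W": (8, 8),
    "decoder_W": (8, 4),
}
vhred_shapes = {
    "utterance_encoder_W": (4, 8),
    "context_encoder_W": (8, 8),
    "decoder_W": (10, 4),       # wider input: context plus latent variable
    "latent_prior_W": (8, 2),   # exists only in VHRED
}

to_copy, skipped = plan_transfer(hred_shapes, vhred_shapes, skip_prefixes=("latent_",))
print("copy from HRED:", to_copy)
print("keep fresh init:", skipped)
```

With real checkpoints, the names in to_copy would be the ones whose saved HRED values are written into the newly built VHRED model before training starts. Whether train.py offers a way to do this is exactly what I could not figure out.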

Regards.

VHRED encoder and context RNN initialization from converged HRED #7