Closed: bikestra closed this issue 9 years ago
Sorry I missed your comment in the paper that you are doing teacher forcing.
Yeah. It would be way better not to use teacher forcing.
In the `show_predictions()` function of `main.lua` (https://github.com/wojciechz/learning_to_execute/blob/master/main.lua#L186), the actual ground truth is fed into the RNN. While making predictions, should you use `argmax(pred_vector)` from the previous iteration instead of `state.data.x[state.pos]`, to make sure the algorithm cannot look at the ground truth before outputting the EOS symbol and finishing the generation of the output sequence? Otherwise the algorithm is only challenged to predict one letter at a time, while the purpose is to produce the whole sequence. I guess ideally one would want to do a beam search.

I apologize if I misunderstood your intent.
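To make the distinction concrete, here is a minimal Python sketch (not the repo's Lua/Torch code; the toy `rnn_step`, weight matrix, and vocabulary are invented for illustration) of the two decoding modes: teacher forcing feeds `state.data.x[state.pos]` (the ground truth) back in at each step, while free-running decoding feeds back the model's own `argmax` prediction:

```python
import numpy as np

# Toy stand-in for a trained RNN over a 4-symbol vocabulary.
VOCAB = ["0", "1", "2", "EOS"]
rng = np.random.default_rng(0)
W = rng.normal(size=(len(VOCAB), len(VOCAB)))  # fake "learned" weights

def rnn_step(prev_token_id):
    # Pretend forward pass: a score vector over the vocabulary.
    return W[prev_token_id]

def decode(ground_truth_ids, teacher_forcing):
    preds = []
    inp = ground_truth_ids[0]  # start symbol
    for t in range(1, len(ground_truth_ids)):
        scores = rnn_step(inp)
        pred = int(np.argmax(scores))
        preds.append(pred)
        # The only difference between the two modes is this line:
        # teacher forcing feeds the ground truth back in, while
        # free-running decoding feeds back the model's own prediction.
        inp = ground_truth_ids[t] if teacher_forcing else pred
    return preds

truth = [0, 1, 2, 3]
print(decode(truth, teacher_forcing=True))
print(decode(truth, teacher_forcing=False))
```

Under teacher forcing an early mistake cannot propagate, since the next input is always correct; in the free-running mode one wrong `argmax` changes every subsequent input, which is exactly why the evaluation above looks easier than full sequence generation (and why beam search helps in the free-running case).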