Closed zhangzibin closed 7 years ago
I am also interested in some notes on this - the idea is clear, but I cannot find a paper.
The algorithm which was implemented is called REINFORCE. Its application to sequence to sequence models is described here: https://arxiv.org/abs/1511.06732
Hi, Thanks for you good work. I found you code support reinforcement learning. Can you give more detail? What paper you implement?