keavil / AAAI18-code

The code of AAAI18 paper "Learning Structured Representation for Text Classification via Reinforcement Learning".

Is Actor-critic used here? #15

Open RizhaoCai opened 5 years ago

RizhaoCai commented 5 years ago

I am confused by your code.

The paper says that a policy gradient method [1] is used. More specifically, though, I think the code implements it as an Actor-Critic method.

If I am wrong, please tell me.

[1] Sutton, R. S.; McAllester, D. A.; Singh, S. P.; and Mansour, Y. 2000. Policy gradient methods for reinforcement learning with function approximation. In NIPS, 1057–1063.

MENGHAH commented 5 years ago

I think it's more like DDPG.
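
For anyone comparing the terms used in this thread, here is a minimal sketch of how they usually differ. It is illustrative only and not taken from this repository; the function names and the `values` argument are hypothetical. In plain policy gradient (REINFORCE), the gradient of log pi(a_t|s_t) is weighted by the sampled Monte Carlo return, while a one-step actor-critic weights it by a TD error computed from a learned critic. DDPG is a further actor-critic variant that uses a deterministic actor, a Q-value critic, and target networks.

```python
# Illustrative sketch only: hypothetical helper names, not the repository's code.

def discounted_returns(rewards, gamma):
    """Monte Carlo returns G_t = r_t + gamma * G_{t+1}."""
    returns = []
    g = 0.0
    # Walk the episode backwards so each return reuses the next one.
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_weights(rewards, gamma):
    """REINFORCE: grad log pi(a_t|s_t) is scaled by the sampled return G_t."""
    return discounted_returns(rewards, gamma)


def actor_critic_weights(rewards, values, gamma):
    """One-step actor-critic: scaled by the TD error
    r_t + gamma * V(s_{t+1}) - V(s_t).

    `values` is assumed to hold a critic's estimates V(s_0), ..., V(s_T),
    where V(s_T) is the terminal-state value (usually 0).
    """
    return [r + gamma * values[t + 1] - values[t] for t, r in enumerate(rewards)]


if __name__ == "__main__":
    rewards = [0.0, 0.0, 1.0]          # reward arrives only at the end
    values = [0.2, 0.4, 0.8, 0.0]      # made-up critic estimates
    print(reinforce_weights(rewards, 0.5))
    print(actor_critic_weights(rewards, values, 0.5))
```

So one quick check is whether the update is weighted by the raw episode return or by a value predicted by a separate critic network; the second case is what is usually called actor-critic.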