keon / policy-gradient

Minimal Monte Carlo Policy Gradient (REINFORCE) Algorithm Implementation in Keras
MIT License
159 stars, 43 forks

Train agent process error #6

Open SZH1230456 opened 4 years ago

SZH1230456 commented 4 years ago

I think the code in the agent training process (https://github.com/keon/policy-gradient/blob/b83f050b70fd0af3358a0f2748743d30c0e7462f/pg.py#L56) has some errors: the calculation does not produce the correct result. For a reference implementation to compare against, see https://github.com/gabrielgarza/openai-gym-policy-gradient/blob/master/policy_gradient_layers.py#
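
The report above does not say which step of the calculation is wrong. For comparison, here is a minimal sketch of the return computation that textbook REINFORCE uses before the gradient step: discount the per-step rewards backwards through the episode, then standardize them by subtracting the mean and dividing by the standard deviation. The function names `discounted_returns` and `standardize`, the `gamma` default, and the `eps` term are assumptions for illustration; they are not taken from `pg.py` or from the linked reference.

```python
import numpy as np


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for each step t.

    Hypothetical helper for illustration; not the code in pg.py.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    returns = np.zeros_like(rewards)
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def standardize(returns, eps=1e-8):
    """Subtract the mean, then divide by the standard deviation.

    eps (an assumed constant) guards against division by zero
    when every return in the episode is identical.
    """
    returns = np.asarray(returns, dtype=np.float64)
    return (returns - returns.mean()) / (returns.std() + eps)


if __name__ == "__main__":
    episode_rewards = [1.0, 1.0, 1.0]
    g = discounted_returns(episode_rewards, gamma=0.5)
    print(g)
    print(standardize(g))
```

Checking a training routine against a small hand-computable episode like the one in the `__main__` block is one way to narrow down where its result diverges.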