keiohta / tf2rl

TensorFlow2 Reinforcement Learning
MIT License
464 stars 104 forks

Implement VPG #15

Open keiohta opened 5 years ago

keiohta commented 5 years ago

Policy Gradient Methods for Reinforcement Learning with Function Approximation

keiohta commented 5 years ago

The Trainer needs to be designed carefully so that both on-policy and off-policy agents can coexist in it.
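One way to picture the coexistence problem is a single loop that branches on how the agent consumes data. The sketch below is only an illustration, not the tf2rl API: the names `OnPolicyAgent`, `OffPolicyAgent`, `run_training`, `horizon`, and `batch_size` are all hypothetical, and the transitions are random placeholders rather than real environment steps.

```python
import random
from collections import deque


class OffPolicyAgent:
    """Hypothetical agent that learns from sampled past transitions."""
    on_policy = False

    def __init__(self, batch_size=4):
        self.batch_size = batch_size
        self.n_updates = 0

    def train(self, batch):
        # Placeholder update: a real agent would compute a loss here.
        self.n_updates += 1


class OnPolicyAgent:
    """Hypothetical agent (e.g. VPG) that learns from fresh rollouts only."""
    on_policy = True

    def __init__(self, horizon=8):
        self.horizon = horizon
        self.n_updates = 0

    def train(self, batch):
        # Placeholder update: a real agent would compute a policy gradient here.
        self.n_updates += 1


def run_training(agent, n_steps, seed=0):
    """Run one loop that serves both agent kinds (assumed design)."""
    rng = random.Random(seed)
    replay = deque(maxlen=1000)  # long-lived buffer, used by off-policy agents
    rollout = []                 # short-lived buffer, used by on-policy agents
    for _ in range(n_steps):
        # Fake (observation, action, reward) tuple standing in for env.step().
        transition = (rng.random(), rng.randrange(2), rng.random())
        if agent.on_policy:
            rollout.append(transition)
            if len(rollout) == agent.horizon:
                agent.train(list(rollout))
                rollout.clear()  # data from the old policy must be discarded
        else:
            replay.append(transition)
            if len(replay) >= agent.batch_size:
                agent.train(rng.sample(list(replay), agent.batch_size))
    return agent.n_updates
```

In this sketch the on-policy agent updates once per full rollout and then throws the data away, while the off-policy agent updates every step from a persistent buffer. The two paths could also be split into separate trainer classes.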

keiohta commented 5 years ago

Compare the following: