yandexdataschool / AgentNet

Deep Reinforcement Learning library for humans
http://agentnet.rtfd.org/

Learned baseline #23

justheuristic closed this issue 8 years ago

justheuristic commented 8 years ago

Implement an algorithm that learns a common baseline for Q-values.

http://arxiv.org/pdf/1301.2315.pdf

justheuristic commented 8 years ago

Yes, and it might be nice to include it in the second tutorial on "some advanced tricks".

justheuristic commented 8 years ago

A similar approach was implemented via Advantage Actor-Critic.
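
For readers unfamiliar with the idea, here is a minimal sketch of a learned baseline in plain Python. It is not AgentNet's API and not necessarily the method from the linked paper: the class `LearnedBaseline` and its parameters are hypothetical names, and the baseline is assumed to be a single scalar shared across all states and actions, fitted by gradient steps on the squared error to observed returns. Advantage Actor-Critic typically replaces this scalar with a state-dependent value estimate V(s).

```python
class LearnedBaseline:
    """Hypothetical scalar baseline b, fitted by SGD on 0.5 * mean((G - b)^2)."""

    def __init__(self, learning_rate=0.1, initial_value=0.0):
        self.learning_rate = learning_rate
        self.value = initial_value

    def update(self, returns):
        """Take one gradient step toward the mean of a batch of returns."""
        mean_return = sum(returns) / len(returns)
        # The gradient of the loss w.r.t. b is (b - mean_return),
        # so descending it moves b toward the batch mean.
        self.value += self.learning_rate * (mean_return - self.value)
        return self.value

    def advantages(self, returns):
        """Subtract the current baseline from each return."""
        return [g - self.value for g in returns]


# Toy usage: repeatedly fit the baseline to one fixed batch of returns.
baseline = LearnedBaseline(learning_rate=0.5)
batch = [10.0, 12.0, 8.0]
for _ in range(20):
    baseline.update(batch)
centered = baseline.advantages(batch)
```

Once the baseline has settled near the mean return, the centered values sum to roughly zero, which is what reduces the variance of updates that are scaled by them.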