rl-tokyo / survey

Survey repository for reinforcement learning papers

PGQ: Combining policy gradient and Q-learning #5

Open sotetsuk opened 7 years ago

sotetsuk commented 7 years ago

https://arxiv.org/abs/1611.01626

sotetsuk commented 7 years ago

8/10

sotetsuk commented 7 years ago

Discussion, questions, and comments

Other literature to read