DanielTakeshi / rl_algorithms

I am implementing a lot of reinforcement learning and imitation learning algorithms since I'm sick of reading about them but not really understanding them.
MIT License

G-learning, test with infinite horizon #2

Open DanielTakeshi opened 7 years ago

DanielTakeshi commented 7 years ago

It turns out that the G-learning paper doesn't use the episodic setting (at least for the cliff-world experiments, which are my main concern). Let's write a new, non-episodic cliff-world environment and check whether our results match theirs.
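As a starting point, here is a minimal sketch of what a non-episodic cliff-world could look like. Everything in it is an assumption rather than the paper's exact setup: the class name `ContinuingCliffWorld`, the 4x12 grid, the reward values, and the choice to teleport the agent back to the start after it falls off the cliff or reaches the goal (instead of ending an episode) are all hypothetical and should be checked against the paper before comparing results.

```python
class ContinuingCliffWorld:
    """Cliff-world with no terminal states (hypothetical sketch).

    Assumed layout: the bottom row holds the start (left corner), the
    goal (right corner), and cliff cells in between. Falling off the
    cliff or reaching the goal moves the agent back to the start, so
    the interaction never terminates.
    """

    # Actions: 0 = up, 1 = right, 2 = down, 3 = left
    MOVES = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}

    def __init__(self, rows=4, cols=12, step_reward=-1.0,
                 cliff_reward=-100.0, goal_reward=0.0):
        # All default values here are assumptions, not taken from the paper.
        self.rows, self.cols = rows, cols
        self.step_reward = step_reward
        self.cliff_reward = cliff_reward
        self.goal_reward = goal_reward
        self.start = (rows - 1, 0)
        self.goal = (rows - 1, cols - 1)
        self.state = self.start

    def reset(self):
        """Put the agent at the start; only needed once, before the first step."""
        self.state = self.start
        return self.state

    def _is_cliff(self, pos):
        r, c = pos
        return r == self.rows - 1 and 0 < c < self.cols - 1

    def step(self, action):
        """Apply an action and return (next_state, reward); there is no done flag."""
        dr, dc = self.MOVES[action]
        # Moves that would leave the grid are clipped to the border.
        r = min(max(self.state[0] + dr, 0), self.rows - 1)
        c = min(max(self.state[1] + dc, 0), self.cols - 1)
        nxt = (r, c)
        if self._is_cliff(nxt):
            reward, nxt = self.cliff_reward, self.start
        elif nxt == self.goal:
            reward, nxt = self.goal_reward, self.start
        else:
            reward = self.step_reward
        self.state = nxt
        return nxt, reward
```

Because `step` never signals termination, the training loop would run for a fixed number of steps rather than a fixed number of episodes, and returns would need discounting (or averaging) to stay finite.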