RPegoud / jym

JAX implementation of RL algorithms and vectorized environments
MIT License

Add the CliffWalking environment and reproduce Sutton & Barto experiment #13

Closed by RPegoud 11 months ago

RPegoud commented 12 months ago

The experiment compares Q-learning and Sarsa (and possibly Expected Sarsa) on the Cliff Walking environment, to highlight how the behaviour of the algorithms differs.
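For reference, here is a minimal sketch of what the comparison involves. It is written in plain Python rather than JAX to stay dependency-free, and it is not the jym implementation. The grid layout and rewards follow the Sutton & Barto example (4x12 grid, -1 per step, -100 and a reset to the start for falling off the cliff). All names (`step`, `epsilon_greedy`, `td_target`, `run`) and the defaults `alpha=0.5`, `eps=0.1` and `episodes=500` are assumptions made for illustration.

```python
import random

ROWS, COLS = 4, 12
START, GOAL = (3, 0), (3, 11)
ACTIONS = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left


def step(state, action):
    """Return (next_state, reward, done) for one move on the grid."""
    dr, dc = ACTIONS[action]
    r = min(max(state[0] + dr, 0), ROWS - 1)  # clip to the grid
    c = min(max(state[1] + dc, 0), COLS - 1)
    if r == ROWS - 1 and 0 < c < COLS - 1:  # fell off the cliff
        return START, -100.0, False
    return (r, c), -1.0, (r, c) == GOAL


def epsilon_greedy(q, state, eps, rng):
    """Random action with probability eps, otherwise a greedy one."""
    if rng.random() < eps:
        return rng.randrange(len(ACTIONS))
    values = q[state]
    return values.index(max(values))


def td_target(q, reward, next_state, next_action, done, algo, eps, gamma=1.0):
    """Bootstrap target; the three algorithms differ only here."""
    if done:
        return reward
    nxt = q[next_state]
    if algo == "q_learning":  # off-policy: value of the greedy action
        bootstrap = max(nxt)
    elif algo == "sarsa":  # on-policy: value of the action actually taken
        bootstrap = nxt[next_action]
    else:  # expected_sarsa: expectation under the epsilon-greedy policy
        probs = [eps / len(nxt)] * len(nxt)
        probs[nxt.index(max(nxt))] += 1.0 - eps
        bootstrap = sum(p * v for p, v in zip(probs, nxt))
    return reward + gamma * bootstrap


def run(algo, episodes=500, alpha=0.5, eps=0.1, seed=0):
    """Train one tabular agent and return (q_table, per-episode returns)."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(ROWS) for c in range(COLS)}
    returns = []
    for _ in range(episodes):
        state, total, done = START, 0.0, False
        action = epsilon_greedy(q, state, eps, rng)
        while not done:
            nxt, reward, done = step(state, action)
            # Sarsa needs the next action before the update. For the other
            # two this is a simplification that only matters if nxt == state.
            nxt_action = epsilon_greedy(q, nxt, eps, rng)
            target = td_target(q, reward, nxt, nxt_action, done, algo, eps)
            q[state][action] += alpha * (target - q[state][action])
            state, action, total = nxt, nxt_action, total + reward
        returns.append(total)
    return q, returns
```

Calling `run("q_learning")` and `run("sarsa")` with the same seed and plotting the smoothed per-episode returns should give curves comparable to the book's figure. In the book, Sarsa learns the safer path away from the cliff and gets higher online returns, while Q-learning learns the optimal path along the edge and occasionally falls off because of the epsilon-greedy exploration. Matching the figure closely would likely require averaging over many seeds.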

Here are the results to reproduce: