rapop closed this issue 4 years ago
Hi,
Sorry for not seeing this earlier, just happened to notice the issue today.
The lunar lander problem was solved using the same code as the bipedal walker (located at ./gym/bipedalwalker-v2/ppo.py). The same hyperparameters should converge in the lunar lander environment, but it is probably a good idea to experiment with a smaller network, etc., since the problem is a bit easier in terms of the size of its action and state spaces.
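To make the "smaller network" suggestion concrete: BipedalWalker-v2 has a 24-dimensional observation space and 4-dimensional continuous action space, while LunarLanderContinuous-v2 has only 8 observation dimensions and 2 action dimensions, so a policy network with narrower hidden layers can suffice. The hidden-layer sizes below (64/64 vs. 32/32) are illustrative assumptions, not the values used in the repo:

```python
def mlp_param_count(sizes):
    # Total parameters (weights + biases) of a fully connected
    # network whose consecutive layer widths are given in `sizes`.
    return sum(i * o + o for i, o in zip(sizes, sizes[1:]))

# BipedalWalker-v2: 24-dim observation -> 4-dim continuous action
walker_params = mlp_param_count([24, 64, 64, 4])   # 6020 parameters

# LunarLanderContinuous-v2: 8-dim observation -> 2-dim continuous action,
# so a narrower network already has far fewer parameters to fit
lander_params = mlp_param_count([8, 32, 32, 2])    # 1410 parameters

print(walker_params, lander_params)
```

A smaller network like this usually trains faster on the easier environment, though the original hyperparameters should still converge as noted above.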
If you have any other questions don't hesitate to reach out, but I'll close this issue for now!
Hi, did you use the same code for lunar lander (LunarLanderContinuous-v2)? Are the hyperparameters the same? I am also trying to run PPO on lunar lander continuous, and it didn't seem to converge.
Thanks