Training on BipedalWalkerHardcore seems to result in a negative reward

modestyachts / ARS

An implementation of the Augmented Random Search algorithm


Training on BipedalWalkerHardcore seems to result in a negative reward #7

Open kirk86 opened 5 years ago

kirk86 commented 5 years ago

Hi, and thanks for sharing the code. I've tried running the training process on a different environment, BipedalWalkerHardcore-v2, but it seems that it is not able to learn anything. I even tried different shift values, as noted in the code comments, but I still end up with a negative reward. Should we train for longer, or are there any hyperparameters that we are missing?
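For context on the shift values mentioned above, here is a minimal sketch of how a reward shift can change the return that training sees. It assumes the shift is a constant subtracted from every per-step reward during training rollouts, which is my reading of the code comments rather than something confirmed in this thread. The `rollout_return` helper and the episode rewards are hypothetical; only the -100 penalty for falling comes from the BipedalWalker environments.

```python
def rollout_return(step_rewards, shift=0.0):
    """Sum per-step rewards after subtracting a constant shift from each.

    Assumption: this mirrors how a reward shift is applied in training
    rollouts; it is an illustration, not code copied from the repository.
    """
    return sum(r - shift for r in step_rewards)


# Hypothetical episode: 50 steps of small forward progress, then a fall.
# BipedalWalker environments give a -100 reward when the walker falls.
episode = [0.1] * 50 + [-100.0]

raw = rollout_return(episode)               # unshifted return
shifted = rollout_return(episode, shift=1)  # return after a shift of 1

print(f"raw return:     {raw:.1f}")
print(f"shifted return: {shifted:.1f}")
```

Under these assumptions a positive shift only lowers the return further, so a negative total on this environment may be dominated by the fall penalty rather than by the choice of shift.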

ar8372 commented 2 years ago

Hey @kirk86, I am having a similar issue. Did you solve it? Do look at this thread for my exact issue.