modestyachts / ARS

An implementation of the Augmented Random Search algorithm

About SHIFT #8

Open yichaowa opened 5 years ago

yichaowa commented 5 years ago

I don't understand why we need to subtract a shift from the reward. How should this value be set?

dryanguasr commented 5 years ago

They explain that in the paper. The idea is to remove the survival bonus from the reward function in order to avoid a local optimum in which the policy collects the bonus just by staying alive (for example, by standing still) instead of moving forward. In Hopper the survival bonus is 1 per step, so shift is set to 1; in Humanoid it is 5 per step, so shift is set to 5. This is also noted in a comment in the ars.py file:

# for Swimmer-v1 and HalfCheetah-v1 use shift = 0
# for Hopper-v1, Walker2d-v1, and Ant-v1 use shift = 1
# for Humanoid-v1 used shift = 5
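
To make the effect concrete, here is a minimal sketch of subtracting a constant shift from every per-step reward before summing the rollout return. It is not the repository's code: `rollout_return`, the episode lengths, and the 0.5 progress reward are made up for illustration, and only the survival bonus of 1 per step comes from the Hopper case described above.

```python
def rollout_return(step_rewards, shift=0.0):
    """Sum per-step rewards after subtracting a constant shift from each step."""
    return sum(r - shift for r in step_rewards)


# Assumed Hopper-like setup: survival bonus of 1 per step (from the discussion above).
SURVIVAL_BONUS = 1.0
HORIZON = 1000  # hypothetical maximum episode length

# Hypothetical policy A: stands still for the whole horizon, so it earns
# only the survival bonus and no progress reward.
standing = [SURVIVAL_BONUS + 0.0] * HORIZON

# Hypothetical policy B: moves forward (made-up progress reward of 0.5 per step)
# but falls over after 200 steps.
moving = [SURVIVAL_BONUS + 0.5] * 200

# Without the shift, standing still looks much better than moving.
print(rollout_return(standing), rollout_return(moving))  # 1000.0 300.0

# With shift = survival bonus, only forward progress is rewarded.
print(rollout_return(standing, shift=1.0), rollout_return(moving, shift=1.0))  # 0.0 100.0
```

In this toy case the shift flips the ranking of the two policies, which is why environments without a survival bonus (shift = 0) need no correction.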