modestyachts / ARS

An implementation of the Augmented Random Search algorithm

Divide by zero #1

Open pedronahum opened 6 years ago

pedronahum commented 6 years ago

Hi, first and foremost, thanks for sharing the code. It is greatly appreciated.

I am currently testing ARS in other learning environments and found that, in very difficult environments, users of the code may hit a divide-by-zero error, particularly in the early stages of learning (i.e., when every initial rollout returns zero reward, so the standard deviation of the rewards is zero):

# normalize rewards by their standard deviation
rollout_rewards /= np.std(rollout_rewards)

Thanks,
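
A minimal reproduction of the situation described above may help. It is only a sketch: the array shape `(4, 2)` and the all-zero values are assumptions standing in for early rollouts, not data from the repository. Under NumPy's default error settings the division does not raise an exception; it emits a `RuntimeWarning` and fills the array with NaN.

```python
import warnings

import numpy as np

# Hypothetical stand-in for early training: every rollout returned zero reward.
rollout_rewards = np.zeros((4, 2), dtype=np.float64)

# All rewards are identical, so the standard deviation is exactly 0.0.
std = np.std(rollout_rewards)

with warnings.catch_warnings():
    # Silence the 0/0 RuntimeWarning so the demo output stays readable.
    warnings.simplefilter("ignore", RuntimeWarning)
    normalized = rollout_rewards / std

# True when every normalized reward has become NaN.
all_nan = bool(np.isnan(normalized).all())
print(std, all_nan)
```

Any NaN produced here would propagate into whatever is computed from the normalized rewards afterwards.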

hari-sikchi commented 5 years ago

I have run into this kind of difficulty in every sparse-reward setting. Is ARS a good fit for these optimization landscapes?

ashutoshtiwari13 commented 5 years ago

Could we use .clip(min=1e-2) on the divisor to avoid that?

pedronahum commented 5 years ago

In my case, adding 1e-8 to the divisor did the trick...
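
The two guards suggested in this thread could look roughly like the sketch below. The helper names `normalize_with_eps` and `normalize_with_floor` are hypothetical and do not exist in the ARS repository; the constants `1e-8` and `1e-2` are the ones mentioned in the comments above.

```python
import numpy as np


def normalize_with_eps(rewards, eps=1e-8):
    """Divide by (std + eps): the divisor can never be exactly zero."""
    rewards = np.asarray(rewards, dtype=np.float64)
    return rewards / (np.std(rewards) + eps)


def normalize_with_floor(rewards, min_std=1e-2):
    """Divide by max(std, min_std): same idea as clipping the divisor from below."""
    rewards = np.asarray(rewards, dtype=np.float64)
    return rewards / max(np.std(rewards), min_std)


# Hypothetical all-zero rewards, as in the failing case from the first comment.
zeros = np.zeros((4, 2))
print(normalize_with_eps(zeros))    # finite values, no NaN
print(normalize_with_floor(zeros))  # finite values, no NaN
```

The floor leaves the result unchanged whenever the standard deviation is at least `min_std`, while the epsilon version shifts every divisor by a tiny amount. In the all-zero case both return zeros, so the guard avoids NaN but the rollouts still carry no learning signal for that step.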

ashutoshtiwari13 commented 5 years ago

Yeah @pedronahum, that would do it too!