borgwang closed this issue 6 years ago
Hi, the official cma package is set up to minimize a given function. For a gym environment, the task is to maximize reward. That's why the rewards are negated in the CMAES wrapper before they are passed to the optimizer.
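To illustrate the sign flip, here is a minimal sketch using only NumPy. The helper name `to_minimizer_costs` and the sample reward values are made up for illustration and are not taken from es.py; the only assumption is that the optimizer treats lower function values as better.

```python
import numpy as np


def to_minimizer_costs(rewards):
    """Turn rewards (higher is better) into costs (lower is better).

    Hypothetical helper for illustration; es.py does the same thing inline
    with `-np.array(reward_table_result)`.
    """
    return -np.asarray(rewards, dtype=float)


# Made-up rewards from three rollouts.
rewards = [1.0, 5.0, -2.0]
costs = to_minimizer_costs(rewards)

# A minimizer prefers the smallest cost, which is the largest reward.
best_for_minimizer = int(np.argmin(costs))
best_by_reward = int(np.argmax(rewards))
```

After negation, the candidate that a minimizer ranks first is the same one that earned the highest reward, so minimizing the negated rewards is equivalent to maximizing the original ones.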
@hardmaru OK. Thanks :)
Hi, I was confused by this line in es.py: https://github.com/hardmaru/estool/blob/master/es.py#L115
reward_table = -np.array(reward_table_result)
The reward_table is passed to the tell() method as function_values. But why is a negative sign applied to the raw rewards collected from the rollouts?