steveKapturowski / tensorflow-rl

Implementations of deep RL papers and random experimentation
Apache License 2.0
177 stars, 47 forks

Wrong check for 'reward_threshold' property in cem_actor_learner.py #5

Closed captify-alapite closed 7 years ago

captify-alapite commented 7 years ago

In the train method of the CEMLearner, there's the following check on lines 86-89:

if elite_mean_reward > self.emulator.env.spec.reward_threshold:
    consecutive_successes += 1
else:
    consecutive_successes = 0

Unfortunately, reward_threshold is often None (e.g. with Pendulum-v0). Under Python 2 any number compares greater than None, so the inequality check always succeeds, which halts the CEM training prematurely.
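One possible guard is sketched below. It is only an illustration of the idea, not necessarily what the eventual fix does: the helper name update_consecutive_successes and its arguments are hypothetical, and treating a missing threshold as "never a success" is an assumption. An explicit None check also avoids the TypeError that Python 3 raises when a number is compared with None.

```python
def update_consecutive_successes(elite_mean_reward, reward_threshold,
                                 consecutive_successes):
    """Hypothetical helper: return the updated streak of successful iterations."""
    # Assumption: an environment without a reward threshold can never
    # register a success, so the streak is reset instead of incremented.
    if reward_threshold is None:
        return 0
    # Threshold is defined: same comparison as in the original check.
    if elite_mean_reward > reward_threshold:
        return consecutive_successes + 1
    return 0


# No threshold (as reported for Pendulum-v0): the streak stays at zero.
print(update_consecutive_successes(-150.0, None, 3))
# Threshold defined and exceeded: the streak grows by one.
print(update_consecutive_successes(200.0, 195.0, 3))
```

With a guard like this, early stopping on consecutive successes is effectively disabled for environments that define no threshold, so training runs until its other stopping conditions apply.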

steveKapturowski commented 7 years ago

Fixed: de2e6b324ab3c8b6139f898cf0e40357541986ff