Potentially wrong reward

duckietown / gym-duckietown

Self-driving car simulator for the Duckietown universe

http://duckietown.org

Potentially wrong reward #237

Open Max-Fu opened 3 years ago

Max-Fu commented 3 years ago

Inside the gym environment, there are two robot speed attributes: self.speed and self.robot_speed. self.robot_speed is set to a constant, while self.speed is the true speed. Yet the reward function uses self.robot_speed instead of self.speed. I think this causes a reward mis-specification problem (i.e., DDPG learns a trivial policy). Can one of the repo creators check whether this is indeed an error? Thanks! (I just restarted my run and will check whether this solves the issue.)
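
To make the concern concrete, here is a minimal toy sketch of the suspected problem. It is not the repository's actual code: the reward shape, the coefficients, the ROBOT_SPEED value, and all function names below are assumptions made up for illustration. It only shows that if the speed term is fed a constant, a stationary agent and a moving agent receive the same reward.

```python
# Toy illustration only: names, coefficients, and the constant are hypothetical,
# not taken from gym-duckietown.

ROBOT_SPEED = 1.2  # assumed constant, standing in for self.robot_speed


def lane_reward(speed, dot_dir, dist):
    """Toy reward: pay for speed along the lane direction, penalize lateral offset."""
    return 1.0 * speed * dot_dir - 10.0 * abs(dist)


def reward_with_constant(true_speed, dot_dir, dist):
    """Suspected behavior: the true speed is ignored in favor of the constant."""
    return lane_reward(ROBOT_SPEED, dot_dir, dist)


def reward_with_true_speed(true_speed, dot_dir, dist):
    """Proposed behavior: the reward depends on the speed the robot actually has."""
    return lane_reward(true_speed, dot_dir, dist)


if __name__ == "__main__":
    # Both agents are centered in the lane and aligned with it.
    for true_speed in (0.0, 0.5):
        print(
            true_speed,
            reward_with_constant(true_speed, dot_dir=1.0, dist=0.0),
            reward_with_true_speed(true_speed, dot_dir=1.0, dist=0.0),
        )
```

With the constant, standing still in the lane center is rewarded exactly like driving forward, which would be consistent with an agent converging to a trivial policy.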

CourchesneA commented 3 years ago

@Max-Fu I think there has not been much testing or tuning of that reward function. Please submit a PR if you can improve the current version.