SMARTS reward and threshold_for_counting_wp

huawei-noah / SMARTS

Scalable Multi-Agent RL Training School for Autonomous Driving

MIT License

In smarts/core/sensors.py:941, threshold_for_counting_wp = 0.5 is set. My understanding is that this variable puts a floor on the reported reward: distance travelled is accumulated and only reported once the total exceeds the threshold. For example, if a vehicle travels only 0.01 per step for 5 time steps, it receives 0 reward on each of those steps. If it then accelerates and travels 0.1 per step for 5 more time steps, the agent receives the whole accumulated reward of 0.55 (0.05 + 0.5) on the very last time step. This is problematic because the reward becomes sparse and is no longer a function of a single state-to-state transition but of the trajectory that preceded it. Is there a reason for the threshold to be set to 0.5? I would suggest setting it to 0.0 or making it configurable.
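
To make the example concrete, here is a minimal sketch of the behavior as I understand it. This is not the code in sensors.py: the function name `thresholded_rewards`, the strict `>` comparison, and the reset of the accumulator after reporting are all my assumptions, used only to illustrate the effect of the threshold.

```python
def thresholded_rewards(step_distances, threshold=0.5):
    """Hypothetical model of the thresholding described above.

    Distance is accumulated across steps and only reported as reward
    once the running total exceeds `threshold`; the total is then reset.
    (Assumed behavior, not copied from SMARTS.)
    """
    rewards = []
    accumulated = 0.0
    for distance in step_distances:
        accumulated += distance
        if accumulated > threshold:
            # Report everything accumulated so far in one lump.
            rewards.append(accumulated)
            accumulated = 0.0
        else:
            # Below the threshold: the agent sees no reward this step.
            rewards.append(0.0)
    return rewards


# 5 slow steps (0.01 each) followed by 5 faster steps (0.1 each).
steps = [0.01] * 5 + [0.1] * 5

sparse = thresholded_rewards(steps, threshold=0.5)  # current default
dense = thresholded_rewards(steps, threshold=0.0)   # suggested default

print("threshold=0.5:", [round(r, 2) for r in sparse])
print("threshold=0.0:", [round(r, 2) for r in dense])
```

Under these assumptions, a threshold of 0.5 yields nine zero rewards followed by a single reward of about 0.55, while a threshold of 0.0 reports each step's distance on the step where it was travelled.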

SMARTS reward and threshold_for_counting_wp #660