huawei-noah / SMARTS

Scalable Multi-Agent RL Training School for Autonomous Driving
MIT License

SMARTS reward and threshold_for_counting_wp #660

Open AlexLewandowski opened 3 years ago

AlexLewandowski commented 3 years ago

In smarts/core/sensors.py:941, threshold_for_counting_wp = 0.5 is hard-coded. My understanding of this variable is that it sets a minimum on the reward by only reporting accumulated distance once it exceeds the threshold. For example, if a vehicle travels 0.01 per step for 5 time steps, it receives 0 reward. When the vehicle then accelerates and travels 0.1 per step for 5 more time steps, it receives a reward of 0.55 on the very last time step. This is problematic: the reward becomes sparse and is no longer a function of state-to-state transitions but of entire trajectories. Is there a reason for it to be set to 0.5? I would suggest setting it to 0.0 or allowing customization.
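To make the behaviour I am describing concrete, here is a minimal sketch of the accumulate-then-threshold pattern as I understand it. The function name and structure are my own for illustration, not the actual code in sensors.py:

```python
# Illustrative sketch only -- not the actual SMARTS sensors.py logic.
THRESHOLD_FOR_COUNTING_WP = 0.5  # mirrors the hard-coded 0.5

def step_rewards(distances, threshold=THRESHOLD_FOR_COUNTING_WP):
    """Accumulate per-step travelled distance and emit a reward only
    once the running total crosses the threshold, then reset it."""
    rewards = []
    accumulated = 0.0
    for d in distances:
        accumulated += d
        if accumulated >= threshold:
            rewards.append(accumulated)
            accumulated = 0.0
        else:
            rewards.append(0.0)
    return rewards

# Five slow steps (0.01 each) then five faster steps (0.1 each),
# as in the example above: the first nine rewards are 0, and the
# whole 0.55 arrives in one lump on the last step.
print(step_rewards([0.01] * 5 + [0.1] * 5))
```

This is what I mean by the reward depending on the whole trajectory: the same state transition yields different rewards depending on how much distance was accumulated beforehand.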

Gamenot commented 3 years ago

I can see why the base reward might not be very useful at increments of 0.5 in some cases.

I am not entirely sure why it was set this way except that there was a concern that we were adding rewards too frequently.

We cannot change the default reward without "breaking the interface", but I think this is a candidate for adding a lever in the AgentInterface to adjust this increment, with a default of 0.5.
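Roughly, the lever could look like the following. To be clear, this is only a sketch of the proposal: the waypoint_reward_increment field does not exist today, and the AgentInterface shown here is a simplified stand-in, not the real class:

```python
# Hypothetical sketch of the proposed lever; field name is illustrative.
from dataclasses import dataclass

@dataclass
class AgentInterface:
    # ...existing fields elided...
    # Proposed: the accumulated-distance increment at which the base
    # reward is reported. Defaults to 0.5 to preserve current behaviour.
    waypoint_reward_increment: float = 0.5

# Existing users keep the current behaviour by default:
default_interface = AgentInterface()

# Users who want dense per-step rewards could opt in explicitly:
dense_interface = AgentInterface(waypoint_reward_increment=0.0)
```

Keeping 0.5 as the default means no existing training setup changes behaviour unless it opts in.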