Hi @fangchuan ,
That reward function tries to motivate the agent to drive towards the goal at as high a speed as possible without colliding, driving in the wrong lane, or driving on sidewalks. The coefficients were tuned to produce a scalar reward that is good enough to let the agent learn to drive well.
The reward calculation doesn't take traffic lights into account because traffic-light support doesn't exist in a usable way in the stable version of Carla. Speed limits weren't imposed, but the vehicle's maximum speed is bounded by the physics (handled by the simulator), so this didn't have a huge negative impact.
The reward values are calculated in a way very similar to the Carla paper, for benchmarking and comparison purposes. If you are interested in designing a comprehensive reward function, you may want to start with a basic version (like the one used in this code-base), then add more terms one by one and tune the coefficients until you achieve a reasonable reward/penalty balance.
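For reference, here is a minimal sketch of that Carla-paper-style shaped reward. The coefficients follow the paper (distance in km, speed in km/h); the field names on the measurement dicts are illustrative stand-ins, not the exact attributes used in this code-base:

```python
def calculate_reward(prev, cur):
    """Shaped reward in the spirit of the Carla paper (Dosovitskiy et al., 2017).

    `prev` and `cur` are per-step measurement dicts; the keys below are
    illustrative, not the simulator's actual field names.
    """
    reward = 0.0
    # Progress towards the goal: positive when the distance-to-goal (km) shrinks
    reward += 1000.0 * (prev["dist_to_goal_km"] - cur["dist_to_goal_km"])
    # Encourage higher speed (km/h)
    reward += 0.05 * (cur["speed_kmh"] - prev["speed_kmh"])
    # Penalize newly accumulated collision damage
    reward -= 0.00002 * (cur["collision_damage"] - prev["collision_damage"])
    # Penalize driving onto sidewalks or into the opposite lane
    reward -= 2.0 * (cur["sidewalk_intersection"] - prev["sidewalk_intersection"])
    reward -= 2.0 * (cur["opposite_lane_intersection"] - prev["opposite_lane_intersection"])
    return reward
```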
Some additional factors to consider (see the sketch after this list):
- Traffic-light state, once a Carla release exposes it in a usable way.
- Posted speed limits, penalizing the agent when it exceeds them.
- Keeping each new term's coefficient tunable so you can balance it against the existing terms.
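A hedged sketch of how such extra terms might be added one at a time, each with its own tunable coefficient. The traffic-light and speed-limit fields here are hypothetical, since the stable Carla version used in this code-base doesn't expose usable traffic-light state:

```python
def extended_reward(prev, cur, base_reward):
    """Extend a base reward with hypothetical traffic-light and speed-limit terms.

    The `ran_red_light` and `speed_limit_kmh` fields are illustrative only;
    they are not provided by this code-base's environment.
    """
    reward = base_reward
    # Penalize running a red light (if the simulator exposed this signal)
    if cur.get("ran_red_light", False):
        reward -= 5.0  # coefficient to be tuned
    # Penalize exceeding the posted speed limit (km/h), proportionally;
    # with no limit available, the penalty term is zero
    overspeed = max(0.0, cur["speed_kmh"] - cur.get("speed_limit_kmh", float("inf")))
    reward -= 0.1 * overspeed  # coefficient to be tuned
    return reward
```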
Thank you! I'm trying to tune the coefficients of the reward equation.
Hi, recently I have been concentrating on training my agent in Carla, and my DQN-based agent seems to be doing reasonably well. But I still cannot understand why you calculate the reward this way:
https://github.com/PacktPublishing/Hands-On-Intelligent-Agents-with-OpenAI-Gym/blob/master/ch8/environment/carla_gym/envs/carla_env.py
Is this really a well-considered way to calculate the reward? For example, does it take into account traffic lights, speed limits, and things like that? And what does each coefficient mean? I want to formulate a comprehensive way to calculate the reward, but I don't have any good ideas. I'm looking forward to your reply. @praveen-palanisamy