xianhong / DDPG-TORCS

Reinforcement learning of driving a racing car in TORCS using DDPG algorithm

Question about Learning to brake #2

Open hz3014 opened 5 years ago

hz3014 commented 5 years ago

Hi, I used your code and trained a decent agent, but it doesn't brake, so I am now trying to implement stochastic braking. Do I need to uncomment both lines 94-99 and lines 105-112 in ddpg.py? I noticed that lines 105-112 are something Yanpanlau's code didn't have. What is the advantage of that additional code? Thanks.

xianhong commented 5 years ago

No need to uncomment lines 105-112. Back then, I also found it difficult to get the car to learn to use the brake.
Based on Yanpanlau's post, the reward formula encourages the car to reach the "maximum" speed along the track axis. In TORCS, as far as accumulated reward is concerned, driving at full speed (e.g. pressing hard on the gas pedal) and only turning the steering wheel seems more optimal than learning to control the steering wheel, gas pedal, and brake together, because any brake usage in the latter case inevitably slows the car down. In the real world, driving at full speed makes it difficult to keep the car on the track, but the reward system we use does not take that difficulty into account. If the reward calculation were tweaked to factor in maneuvering difficulty, by introducing some kind of penalty for high speed, the car might be more likely to use the brake.
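
To make the idea of a high-speed penalty concrete, here is a minimal sketch. It is not code from this repo or from Yanpanlau's post: the function name `shaped_reward`, the parameters `speed_limit` and `penalty_weight`, and their default values are all hypothetical, and the base term is just the speed along the track axis as described above.

```python
import math


def shaped_reward(speed_x, angle, track_pos,
                  speed_limit=150.0, penalty_weight=0.5):
    """Speed along the track axis minus a penalty for driving too fast.

    speed_x        -- longitudinal speed of the car
    angle          -- angle between car heading and track axis, in radians
    track_pos      -- lateral offset from the track center (unused here,
                      kept so the signature matches typical TORCS observations)
    speed_limit    -- hypothetical speed above which the penalty applies
    penalty_weight -- hypothetical weight of the overspeed penalty
    """
    # Base term: component of the speed along the track axis.
    progress = speed_x * math.cos(angle)

    # Assumed penalty: grows linearly with the speed above speed_limit.
    overspeed = max(0.0, speed_x - speed_limit)

    return progress - penalty_weight * overspeed


if __name__ == "__main__":
    print(shaped_reward(100.0, 0.0, 0.0))  # below the limit, no penalty
    print(shaped_reward(200.0, 0.0, 0.0))  # above the limit, penalized
```

Note that with `penalty_weight` below 1 the reward on a straight still grows with speed, only more slowly, so the agent would still have little reason to brake. In this sketch the weight has to exceed 1 (or the penalty has to depend on something like upcoming curvature) before slowing down actually pays off.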