Adding continuous brake control in the Carla Env.py

sheelabhadra / Learning2Drive

Implementation of the paper "Learning to Drive in a day" in CARLA.

MIT License


Adding continuous brake control in the Carla Env.py #3

Closed Jidenna closed 4 years ago

Jidenna commented 4 years ago

Thanks for the fantastic work you have done here.

So I am trying to add brake to the agent continuous control. I noticed from your code in env.py you set:

control.brake = 0., which is understandable, since a simple lane-following task does not need brake control. However, when I added other cars and pedestrians, the agent never learned to brake and always collided.

I have tried to set control.brake = float(np.abs(np.clip(action[1], -1, 0))) but it doesn't seem to work. Is this because you are using the control() from carla_server_pb2 rather than the standard VehicleControl() from carla.client?

If yes then how do I go about adding a continuous brake control into your code?

Thanks

sheelabhadra commented 4 years ago

Hi @Jidenna, in my experiments, I restricted the throttle to be between 0.4 and 0.6 (line 21 in env.py). If you wish to add brake control, you could skip the code in lines 105-108 in env.py. I think setting the throttle and brake in the following manner should work.

control.throttle = float(np.clip(action[1], 0, 1))
control.brake = float(np.abs(np.clip(action[1], -1, 0)))
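The two lines above split a single action component into throttle (its positive part) and brake (its negative part). A minimal standalone sketch of that mapping follows; the function name split_longitudinal_action is hypothetical, and it uses plain Python clamping in place of np.clip so that it runs without NumPy.

```python
def split_longitudinal_action(value):
    """Map one action component in [-1, 1] to (throttle, brake), each in [0, 1].

    Hypothetical helper: positive values become throttle, negative values
    become brake, so the two are never applied at the same time.
    """
    v = float(value)
    throttle = min(max(0.0, v), 1.0)   # same result as np.clip(v, 0, 1)
    brake = min(max(0.0, -v), 1.0)     # same result as abs(np.clip(v, -1, 0))
    return throttle, brake


if __name__ == "__main__":
    for a in (0.7, -0.3, -2.0):
        print(a, split_longitudinal_action(a))
```

With this mapping the agent cannot accelerate and brake simultaneously, but a policy whose output hovers around zero will switch between the two frequently.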

Jidenna commented 4 years ago

Thanks for taking the time to respond to this.

So the brake control now works as you suggested it would. However, the vehicle brakes too often, and most of the time it doesn't move for several timesteps but still accumulates a good reward.

I have added a reward to punish the agent for not moving using the following code:

self.nospeed_times = 0  # must run once per episode (e.g. in reset()), not on every step
forward_speed = measurements.player_measurements.forward_speed
if forward_speed * 3.6 <= 1.0:  # convert speed from m/s to km/h
    self.nospeed_times += 1
    if self.nospeed_times > 200:  # stuck for more than 200 steps: end the episode
        done = True
    reward = -1  # punish every step on which the vehicle is not moving
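For reference, the same idea can be written as a small self-contained sketch. The class name StallMonitor and its parameters are hypothetical, and it assumes the counter should track consecutive stalled steps, so it is cleared as soon as the vehicle moves again (the snippet above does not say whether that is intended).

```python
class StallMonitor:
    """Hypothetical helper that penalizes a vehicle for standing still."""

    def __init__(self, min_speed_kmh=1.0, max_stalled_steps=200, penalty=-1.0):
        self.min_speed_kmh = min_speed_kmh
        self.max_stalled_steps = max_stalled_steps
        self.penalty = penalty
        self.stalled_steps = 0

    def reset(self):
        """Call at the start of every episode."""
        self.stalled_steps = 0

    def update(self, forward_speed_ms):
        """Return (reward_override, done) for one step; override is None when moving."""
        if forward_speed_ms * 3.6 <= self.min_speed_kmh:  # m/s to km/h
            self.stalled_steps += 1
            done = self.stalled_steps > self.max_stalled_steps
            return self.penalty, done
        self.stalled_steps = 0  # assumption: only consecutive stalled steps count
        return None, False


if __name__ == "__main__":
    monitor = StallMonitor(max_stalled_steps=3)
    for speed in (0.0, 0.0, 0.0, 0.0, 5.0):
        print(speed, monitor.update(speed))
```

Keeping the counter in an object that is reset once per episode avoids the counter being set back to zero on every step.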

However, this is still not a viable way to reach my ultimate goal, which is to make the agent learn to brake before colliding with other dynamic vehicles. This is why I needed the brake control in the first place.

Is there a workaround to achieve this? Sorry, I am still new to this, and I would really appreciate your help.

sheelabhadra commented 4 years ago

Yes, the problem with adding brake control is that it takes time for the agent to understand that braking often is bad. If you also need to handle dynamic obstacles, you would need to design a much more complex reward function than what I have used. I would suggest you take a look at this repo for some ideas. Hope that helps!
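As a rough illustration only, a reward that accounts for obstacles might look like the sketch below. Everything in it is an assumption rather than code from this repository: the function name shaped_reward, the weights, the target speed, and the obstacle_distance_m input, which the environment would have to compute itself.

```python
def shaped_reward(speed_kmh, collided, obstacle_distance_m=None,
                  target_speed_kmh=20.0, safe_distance_m=10.0,
                  collision_penalty=-10.0):
    """Hypothetical reward: drive when the road is clear, slow down near obstacles.

    obstacle_distance_m is the distance to the nearest obstacle ahead,
    or None when nothing is in range (an assumed input).
    """
    if collided:
        return collision_penalty
    # Fraction of the target speed reached, clamped to [0, 1].
    speed_ratio = max(0.0, min(speed_kmh, target_speed_kmh)) / target_speed_kmh
    if obstacle_distance_m is not None and obstacle_distance_m < safe_distance_m:
        return 1.0 - speed_ratio  # obstacle close: reward low speed
    return speed_ratio            # road clear: reward progress


if __name__ == "__main__":
    print(shaped_reward(20.0, False))        # clear road at target speed
    print(shaped_reward(20.0, False, 5.0))   # fast near an obstacle
    print(shaped_reward(0.0, False, 5.0))    # stopped near an obstacle
```

The point of the two branches is that standing still earns nothing on a clear road, so braking only pays off when an obstacle is actually close.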

Jidenna commented 4 years ago

Thanks for all the help.

Aarnnity commented 3 years ago

Thanks for all the help.

Hi, could you share your DDPG code? I saw from your message that you added braking to the continuous control, and I would like to learn from it. I hope to hear from you.