zhejz / carla-roach

Roach: End-to-End Urban Driving by Imitating a Reinforcement Learning Coach. ICCV 2021.
https://zhejz.github.io/roach

Question about history information #24

Closed atg93 closed 1 year ago

atg93 commented 1 year ago

Hi,

First of all, End-to-End Urban Driving by Imitating a Reinforcement Learning Coach is excellent work, and thank you for sharing it along with your code. However, I couldn't understand some points in the paper, and I would appreciate your help.

As I understand it, the history of vehicles and pedestrians is included in the BEV input when training the Roach expert. This seems to conflict with the Markov assumption, under which the current state alone should be sufficient. What is the motivation for adding history information to the state? What advantage does it bring to the RL model?

In addition, the imitation model in the paper does not use any history information. Do you think this might cause a problem? The RL expert can extract additional information from the history, whereas the imitation model does not receive any of this information in its input. Nevertheless, we expect similar behaviour from both models.

zhejz commented 1 year ago

The problem is not Markovian because our BEV formulation is incomplete. Given only the current BEV frame, it’s not possible to infer the velocity and the yaw rate of other traffic participants. Therefore, we have to include the history information as input, and that works well in practice. You can make the problem Markovian by, for example, adding velocity and yaw rate as extra channels to the BEV. In that case the history information is no longer needed.
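To make the point above concrete, here is a minimal NumPy sketch, not the repository's actual code. The grid size, the single-cell rasterisation, and the function names `render_bev`, `stacked_observation`, and `velocity_observation` are all assumptions made up for this illustration. It shows that a parked and a moving vehicle produce the same single-frame BEV, while either stacking history frames or adding velocity channels tells them apart.

```python
import numpy as np

GRID = 8  # hypothetical BEV size in cells, chosen only for this toy example


def render_bev(position):
    """Rasterise one vehicle at (row, col) into a binary occupancy mask."""
    bev = np.zeros((GRID, GRID), dtype=np.float32)
    row, col = position
    bev[row, col] = 1.0
    return bev


def stacked_observation(trajectory, history=3):
    """Stack the last `history` masks along the channel axis, oldest first."""
    frames = [render_bev(p) for p in trajectory[-history:]]
    return np.stack(frames, axis=0)


def velocity_observation(trajectory, dt=1.0):
    """Current mask plus two channels holding the velocity in cells per step."""
    (r0, c0), (r1, c1) = trajectory[-2], trajectory[-1]
    occupancy = render_bev((r1, c1))
    v_row = occupancy * (r1 - r0) / dt
    v_col = occupancy * (c1 - c0) / dt
    return np.stack([occupancy, v_row, v_col], axis=0)


# Two vehicles that end in the same cell: one parked, one driving along a row.
parked = [(4, 4), (4, 4), (4, 4)]
moving = [(4, 2), (4, 3), (4, 4)]

# The current frame alone cannot distinguish them.
same_current = np.array_equal(render_bev(parked[-1]), render_bev(moving[-1]))

# Both richer observations can.
differ_stacked = not np.array_equal(
    stacked_observation(parked), stacked_observation(moving)
)
differ_velocity = not np.array_equal(
    velocity_observation(parked), velocity_observation(moving)
)
```

Yaw rate could be added as a further channel in the same way as the two velocity channels; it is left out here to keep the sketch short.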

Our IL model uses a single camera image as input and does not use any history information. Indeed, this will affect the performance. If you use multiple cameras and stack history frames, the IL model should perform even better, but it also takes longer to train. In the paper we simply took the simpler approach and confirmed that it worked well enough.
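For reference, stacking history frames for a camera-based model can be sketched as below. This is an illustrative assumption rather than anything in the Roach code base: the `FrameStack` class, its method names, and the choice to concatenate along the channel axis are all hypothetical.

```python
from collections import deque

import numpy as np


class FrameStack:
    """Keep the last `num_frames` camera images and expose them as one array."""

    def __init__(self, num_frames):
        # A bounded deque drops the oldest frame automatically on append.
        self.frames = deque(maxlen=num_frames)

    def reset(self, first_frame):
        """Start an episode by filling the buffer with copies of the first frame."""
        self.frames.clear()
        for _ in range(self.frames.maxlen):
            self.frames.append(first_frame)
        return self.observation()

    def push(self, frame):
        """Add the newest frame and return the updated stacked observation."""
        self.frames.append(frame)
        return self.observation()

    def observation(self):
        """Concatenate frames along the channel axis, oldest first."""
        return np.concatenate(list(self.frames), axis=-1)


# Usage with tiny dummy RGB images of shape (height, width, channels).
stack = FrameStack(num_frames=3)
obs = stack.reset(np.zeros((4, 6, 3), dtype=np.float32))
obs = stack.push(np.ones((4, 6, 3), dtype=np.float32))
```

With three RGB frames the network input grows from 3 to 9 channels, which is one reason a stacked model is more expensive to train.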