opendilab / InterFuser

[CoRL 2022] InterFuser: Safety-Enhanced Autonomous Driving Using Interpretable Sensor Fusion Transformer
Apache License 2.0

Network details #36

Closed (rockstarsir closed this issue 1 year ago)

rockstarsir commented 1 year ago
  1. I see that the network depends on the actor details JSON as input for predicting the velocity. Am I correct?
  2. Can I skip the actor details when predicting the velocity?
deepcs233 commented 1 year ago

Hi!

  1. We use the velocity from the actor details JSON file as the ground truth to supervise the network.
  2. No, the data from the actor details JSON is needed as the ground truth (GT).
rockstarsir commented 1 year ago

Hi,

I appreciate you taking the time to reply.

I have a few more questions. Could you please help me with them?

  1. What is the significance of the waypoints.npy file, given that you already take the future waypoints from the measurements JSON?
  2. What are the commands x_command, y_command, command, and gt_command in the measurements file? In particular, I see you take the command key from the file, one-hot encode it, and append the speed to it. What is the significance of that?
  3. What is the third value in the future waypoints? I see that it is mostly 4.
  4. How are you calculating the velocity loss without using the velocity ground truth? In particular, at one point you use "loss_traffic, loss_velocity = loss_fns["traffic"](output[0], target[4])". It is a bit confusing how indexing different positions of output and target yields the velocity loss.

Thanks in advance

deepcs233 commented 1 year ago

Hi, thanks for your interest in our project!

  1. Actually, InterFuser doesn't use this file. It collects the future waypoints of the ego car in one file to speed up I/O. But in our project, we only use the future waypoints provided by the navigation algorithm, not the ones the ego car produced.
  2. For these four values, you can refer to https://github.com/opendilab/InterFuser/blob/e0682c350892a243cf40bf448622743f4b26d0f3/leaderboard/team_code/base_agent.py#L431. The first two are the coordinates of the target point, and the last two are commands (e.g. turn left, turn right, ...) from two different planners. We feed the command key and the speed into the network to provide more information for the ego car.
  3. Sorry, I don't understand what you mean. Can you provide more information? Maybe it is the z-axis or a command?
  4. target[4] includes the velocity ground truth.
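
To make point 2 concrete, here is a minimal sketch of how a command and a speed could be combined into one input vector. It is not the repository's code: the function name `encode_measurements`, the constant `NUM_COMMANDS`, and the assumption that commands are integers from 1 to 6 are all hypothetical and used only for illustration.

```python
from typing import List

# Assumption: the planner emits integer commands in the range 1..6.
NUM_COMMANDS = 6


def encode_measurements(command: int, speed: float) -> List[float]:
    """One-hot encode the command and append the ego speed (hypothetical helper)."""
    if not 1 <= command <= NUM_COMMANDS:
        raise ValueError(f"command must be in 1..{NUM_COMMANDS}, got {command}")
    one_hot = [0.0] * NUM_COMMANDS
    one_hot[command - 1] = 1.0  # commands are 1-based, list indices are 0-based
    return one_hot + [float(speed)]


features = encode_measurements(command=4, speed=5.5)
print(features)  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 5.5]
```

The one-hot part keeps the command categorical, so the network does not read an ordering into the command numbers, while the speed stays a plain scalar in the last slot.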
rockstarsir commented 1 year ago

Hi,

Thanks for your reply. I appreciate your efforts.

  1. The third point in my comment above was about the third value in the future waypoints list in the measurements JSON file. I have since figured out that it represents the command.
  2. Can you share a rough picture of the network if you have one? I saw the figure in the paper, but I am trying to get a deeper understanding of how the data flows, especially the measurements and actors data. A depiction of the network architecture would be helpful.
  3. What is the significance of the end2end variable in the InterFuser class?

Thanks in advance for your reply. I will come back with more questions :)

deepcs233 commented 1 year ago

Hi,

  1. It denotes the command. 4 means "Go straight", so it is the most common value.
  2. You can read the paper together with the code (https://github.com/opendilab/InterFuser/blob/e0682c350892a243cf40bf448622743f4b26d0f3/interfuser/timm/models/interfuser.py#LL980C9-L980C16) to understand it. Here, self.direct_concat = False, self.end2end = False, self.waypoints_pred_head = "gru". The model first uses a transformer encoder to fuse data from the different sources. Then it uses a transformer decoder to produce the different types of output, such as traffic sign, waypoints, etc. The actors data is used to render a heatmap (20 x 20 x 7) that supervises the model. The heatmap includes information about the nearby actors (position, bbox, direction, velocity).
  3. If we set end2end to True, the model directly outputs the waypoints, without the other outputs such as the heatmap, traffic sign, etc.
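
As a rough illustration of point 2, and of why one loss call can return both a traffic loss and a velocity loss, here is a simplified sketch. It is not the repository's actual rendering or loss code. The channel order, the 1 m cell size, the actor dictionary keys, and the names `render_heatmap` and `split_l1_loss` are all assumptions made for this example.

```python
import math
from typing import Dict, List, Tuple

GRID = 20      # cells per side, from the 20 x 20 x 7 heatmap mentioned above
CHANNELS = 7   # assumed order: occupancy, dx, dy, length, width, yaw, speed
SPEED_CH = 6   # assumption: speed is stored in the last channel
CELL_M = 1.0   # assumption: each cell covers 1 m x 1 m

Heatmap = List[List[List[float]]]


def empty_heatmap() -> Heatmap:
    return [[[0.0] * CHANNELS for _ in range(GRID)] for _ in range(GRID)]


def render_heatmap(actors: List[Dict[str, float]]) -> Heatmap:
    """Write each actor into the cell containing its centre (ego-relative metres)."""
    heatmap = empty_heatmap()
    for a in actors:
        col = math.floor(a["x"] / CELL_M) + GRID // 2  # x: lateral, ego in the middle
        row = math.floor(a["y"] / CELL_M)              # y: distance ahead of the ego
        if not (0 <= row < GRID and 0 <= col < GRID):
            continue  # actor lies outside the rendered area
        dx = a["x"] - (col - GRID // 2) * CELL_M       # offset inside the cell
        dy = a["y"] - row * CELL_M
        heatmap[row][col] = [1.0, dx, dy, a["length"], a["width"], a["yaw"], a["speed"]]
    return heatmap


def split_l1_loss(pred: Heatmap, target: Heatmap) -> Tuple[float, float]:
    """Mean L1 error, reported separately for the speed channel and the rest."""
    box_err = 0.0
    speed_err = 0.0
    for r in range(GRID):
        for c in range(GRID):
            for ch in range(CHANNELS):
                err = abs(pred[r][c][ch] - target[r][c][ch])
                if ch == SPEED_CH:
                    speed_err += err
                else:
                    box_err += err
    cells = GRID * GRID
    return box_err / (cells * (CHANNELS - 1)), speed_err / cells


actors = [{"x": -2.5, "y": 4.2, "length": 2.0, "width": 1.0, "yaw": 0.0, "speed": 3.0}]
target = render_heatmap(actors)   # ground truth built from the actors data
pred = empty_heatmap()            # stand-in for the network's heatmap output
loss_traffic, loss_velocity = split_l1_loss(pred, target)
print(loss_traffic, loss_velocity)
```

Because the velocity sits inside the same target tensor as the rest of the heatmap, a single call on one output and one target can yield both losses by slicing out the velocity channel.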
rockstarsir commented 1 year ago

thanks