autonomousvision / transfuser

[PAMI'23] TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving; [CVPR'21] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
MIT License

I'm confused about pi/2 in data.py #81

Closed muyuuuu closed 2 years ago

muyuuuu commented 2 years ago

https://github.com/autonomousvision/transfuser/blob/9d5ebf01f7b5f6e5c02ebe0cba423119c465b388/transfuser/data.py#L228-L252

  1. Line 233 uses np.pi/2-seq_theta[i].
  2. Line 250 uses np.pi/2+ego_theta.

What's the difference between them? After reading your code, I think ego_theta and seq_theta[i] are the same, and x and x_command are also the same.

  3. In line 239, why full_lidar[i][:,1] *= -1 ?

Could you explain this? Thanks.

ap229997 commented 2 years ago

see #57

muyuuuu commented 2 years ago

Sorry, I have read that, but I still don't understand why...

Kait0 commented 2 years ago

1-2. The purpose of this code was, I think, to realign the LiDAR point clouds from multiple time steps into the same coordinate system. Because the code uses self.seq_len = 1, there are no past time steps to realign. So, as you said, ego_theta and seq_theta[i] are the same, and that is correct.

  3. I don't think there is a good reason why. It flips the y coordinate axis. I think this arose from some confusion about coordinate systems and was then never fixed. You can find it in multiple places in the code, so it should be consistent.

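For intuition, realigning a point cloud captured at a past ego pose into the current ego frame can be sketched roughly as below. This is a simplified sketch of what a helper like the repo's transform_2d_points does, not the exact implementation; the signature and angle conventions here are assumptions. With self.seq_len = 1 the source and target poses coincide, so the transform reduces to the identity, which is why the realignment has no visible effect:

```python
import numpy as np

def transform_2d_points(xyz, r1, t1_x, t1_y, r2, t2_x, t2_y):
    """Sketch: map 2D points from the frame at pose 1 (rotation r1,
    translation t1) into the frame at pose 2 (rotation r2, translation t2)."""
    c1, s1 = np.cos(r1), np.sin(r1)
    c2, s2 = np.cos(r2), np.sin(r2)
    # homogeneous world-from-frame transforms for both poses
    T1 = np.array([[c1, -s1, t1_x], [s1, c1, t1_y], [0.0, 0.0, 1.0]])
    T2 = np.array([[c2, -s2, t2_x], [s2, c2, t2_y], [0.0, 0.0, 1.0]])
    # lift points to homogeneous coordinates, then frame2 <- world <- frame1
    pts = np.hstack([xyz[:, :2], np.ones((xyz.shape[0], 1))])
    out = (np.linalg.inv(T2) @ T1 @ pts.T).T
    return out[:, :2]

# identical source and target poses (the seq_len = 1 case): identity transform
cloud = np.array([[1.0, 2.0, 0.0], [-0.5, 3.0, 0.0]])
same = transform_2d_points(cloud, 0.3, 1.0, 2.0, 0.3, 1.0, 2.0)
```

Here `same` equals the input x/y coordinates, confirming that with a single timestep the realignment is a no-op.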
muyuuuu commented 2 years ago

So... is there some math manual or introduction to coordinate systems that would help me understand why it is np.pi/2-seq_theta[i] and np.pi/2+ego_theta rather than np.pi/2+seq_theta[i] and np.pi/2-ego_theta?

Kait0 commented 2 years ago

I am not entirely sure what you are confused about. np.pi/2 is a short form of 90 * np.pi / 180: the first factor is a rotation by 90 degrees and the second part converts from degrees to radians. So np.pi/2 just represents 90° in radians. I don't really have a recommendation for books on coordinate systems and their transformations, but it should not be too hard to find one on Google.
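The degree-to-radian equivalence is easy to verify numerically:

```python
import numpy as np

# 90 degrees converted to radians equals pi/2
assert np.isclose(90 * np.pi / 180, np.pi / 2)
assert np.isclose(np.radians(90), np.pi / 2)
```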

ap229997 commented 2 years ago

To add to @Kait0's comment from earlier, the code is written to support multi-timestep input and multi-timestep prediction in the future. So seq_fronts/lefts/rights/rears, seq_lidars, seq_x, seq_y, and seq_theta are all sequences in which the initial timesteps represent the past and the final timestep represents the current timestep. In L198, we have a for loop to parse the input sequences, so the value of i at the end of the for loop corresponds to the current timestep. Hence, in L217-219, i also corresponds to the current timestep of the ego-vehicle. Ideally, I think we should have used -1 instead of i in L217-219 to avoid confusion; sorry about that.
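The indexing point above, that after the parsing loop finishes i refers to the final (current) timestep, can be illustrated with a toy sequence (the values here are made up, not from the dataset):

```python
seq_theta = [0.10, 0.25, 0.40]  # past ... current heading angles (toy values)

for i in range(len(seq_theta)):
    pass  # parse inputs for timestep i, as in the data-loading loop

# after the loop, i holds the last index, so seq_theta[i] == seq_theta[-1]
assert i == len(seq_theta) - 1
assert seq_theta[i] == seq_theta[-1]
```

This is why using i and using -1 after the loop are equivalent, and why -1 would have been the less confusing choice.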

Since we have inputs from different coordinate systems, we transform them into the coordinate system at the current timestep using transform_2d_points. While transforming the coordinate systems, we noticed misalignment (some axes were inverted and there was also a rotation mismatch) in the coordinate axes of the LiDAR points compared to the default CARLA coordinate system. That's why we have the np.pi/2 and *= -1 fixes in multiple places. These fixes are valid for LiDAR and are specific to CARLA 0.9.10. There is no manual to understand these fixes; this is just something we came up with based on visualizing the alignment issue.
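As a rough illustration (not the exact code in data.py; the angle convention and the combined order of operations are assumptions), the two fixes amount to a y-axis flip followed by a rotation offset by np.pi/2:

```python
import numpy as np

def align_lidar(points, theta):
    """Toy sketch: mirror the y axis, then rotate by (np.pi/2 + theta)."""
    pts = points[:, :2].copy()
    pts[:, 1] *= -1                   # the *= -1 fix: flip the y axis
    angle = np.pi / 2 + theta         # the np.pi/2 fix: 90 deg offset + heading
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, -s], [s, c]])   # standard 2D rotation matrix
    return pts @ R.T

# with theta = 0, a point on the x axis lands on the y axis
rotated = align_lidar(np.array([[1.0, 0.0]]), 0.0)  # → approximately [[0., 1.]]
```

Visualizing a point cloud before and after such a transform is how one would spot the kind of axis inversion and rotation mismatch described above.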

The other confusing aspect is np.pi/2-seq_theta[i] in L233 and np.pi/2+ego_theta in L250. There are 2 things to note here:

muyuuuu commented 2 years ago

> this is just something we came up with based on visualizing the alignment issue.

I got it. Thanks very much for your contribution. I think I should learn CARLA.