hustvl / MapTR

[ICLR'23 Spotlight] MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction

Possible Error in av2_map_dataset.py #107

Closed: Zhutianyi7230 closed this issue 9 months ago

Zhutianyi7230 commented 9 months ago

At line 541 of projects/mmdet3d_plugin/datasets/av2_map_dataset.py, when generating labels:

    def gen_vectorized_samples(self, location, map_elements, lidar2global_translation, lidar2global_rotation):
        '''
        use lidar2global to get gt map layers
        av2 lidar2global the same as ego2global
        location the same as log_id
        '''

You seem to assume that the lidar frame is the same as the ego vehicle frame. But according to the description of the coordinate systems in the Argoverse 2 paper, the vehicle coordinate system is not the same as the lidar coordinate system. Is there therefore an error in the transformation matrix in the subsequent code? The following picture illustrates the coordinate systems of Argoverse 2: https://img-blog.csdnimg.cn/img_convert/03a336225473f0f80bc90a8ad8202ba4.png
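For reference, a minimal sketch (numpy only; the identity matrices are hypothetical placeholders) of why the distinction matters: if the lidar frame does not coincide with the ego frame, lidar2global must be composed from ego2global and a nontrivial lidar2ego extrinsic.

    import numpy as np

    # Hypothetical 4x4 homogeneous transforms; identity values are placeholders.
    ego2global = np.eye(4)  # ego (vehicle) frame -> city frame
    lidar2ego = np.eye(4)   # lidar frame -> ego frame; identity only if the frames coincide

    # In general, the correct composition is:
    lidar2global = ego2global @ lidar2ego

    # Treating lidar2global as ego2global is valid only when lidar2ego is the identity.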

LegendBC commented 9 months ago

Thanks for your interest! First, we do not use av2_map_dataset.py; we use av2_offlinemap_dataset.py. Second, we pass e2g to this function here. The misleading argument names are inherited from the nuScenes map dataset, so the implementation is correct. Sorry for the confusion.

Zhutianyi7230 commented 9 months ago

Thank you for your feedback! I still have two more questions.

1. Where is the av2_offlinemap_dataset.py you mentioned? According to maptr_tiny_r50_av2_24e.py, I think the dataset used for training and testing on Argoverse 2 should be the class CustomAV2LocalMapDataset in av2_map_dataset.py. Am I right?

2. In this line of code, at line 1136 of av2_map_dataset.py,

        lidar2cam_rt = cam_info['extrinsics']

   I'm afraid you still treat ego = lidar and directly assign the camera extrinsics to lidar2cam_rt. Is it correct to do so? I think this will affect the GKT calculation (see the sketch below).
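For reference, a minimal sketch (numpy only; cam_info here is a hypothetical record, not the repo's exact code) of how such an extrinsic typically ends up in the projection matrix consumed by the view transformer:

    import numpy as np

    # Hypothetical camera record.
    cam_info = {
        'extrinsics': np.eye(4),  # 4x4: ego (or lidar) frame -> camera frame
        'intrinsics': np.eye(3),  # 3x3 pinhole intrinsics
    }

    lidar2cam_rt = cam_info['extrinsics']     # named lidar2cam, but ego2cam if ego == lidar is assumed

    viewpad = np.eye(4)
    viewpad[:3, :3] = cam_info['intrinsics']  # pad intrinsics to 4x4

    # Projection consumed by the view transform (e.g. GKT):
    lidar2img = viewpad @ lidar2cam_rt        # effectively ego2img under that assumption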

Additionally, I can see this call:

    anns_results = self.vector_map.gen_vectorized_samples(location, map_elements, e2g_translation, e2g_rotation)

That part is OK. Thanks again.

LegendBC commented 9 months ago


av2_offlinemap_dataset.py is in the maptrv2 branch. In both the v1 and v2 branches, we treat the lidar coordinate system as the ego coordinate system in the av2 dataset, since we do not use the lidar point cloud. In practice, we crop the map using the ego coordinate system, and the camera extrinsics are also in the ego coordinate system. So the implementation is self-consistent, which is also confirmed by our visualizations.
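To make that convention concrete, a minimal sketch (numpy only; the poses are hypothetical, not the repo's exact code): defining lidar2ego as the identity puts the map crop, the camera extrinsics, and the resulting projections in one consistent frame.

    import numpy as np

    # Convention described above: no lidar points are used, so the lidar
    # frame is simply defined to be the ego frame.
    lidar2ego = np.eye(4)

    ego2global = np.eye(4)                 # hypothetical pose for one sample
    lidar2global = ego2global @ lidar2ego  # identical to ego2global

    # The map is cropped around the ego pose and the extrinsics are
    # ego -> camera, so every transform lives in the same frame.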

Zhutianyi7230 commented 9 months ago

OK, I totally get it, thank you very much for the clarification. So lidar2cam here actually means ego2cam.

But I found another issue, about how the 'lidar2img' transformation matrix in img_metas is generated during training. In av2_map_dataset.py at line 1048, we first get input_dict['lidar2img']:

    input_dict = self.get_data_info(index)

I think this transform matrix is appropriate as long as the images keep their original 2048 x 1550 size. But then, in the train_pipeline, CustomLoadMultiViewImageFromFiles and PadMultiViewImage pad the image, at which point the camera intrinsics should change accordingly, yet I don't see any corresponding code that updates the lidar2img matrix. Thanks again.
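For context, a minimal sketch (numpy only; the intrinsics values are hypothetical, and it assumes mmcv-style padding that only extends the bottom/right border) of how these two operations interact with the intrinsics: such padding leaves the pixel coordinates of the original content unchanged, whereas a resize would rescale them.

    import numpy as np

    # Hypothetical intrinsics for a 2048 x 1550 image.
    K = np.array([[1000.,    0., 1024.],
                  [   0., 1000.,  775.],
                  [   0.,    0.,    1.]])

    # Bottom/right padding adds pixels without moving existing ones,
    # so an original pixel (u, v) keeps its coordinates and K stays valid.
    K_padded = K.copy()

    # A resize by factor s, by contrast, moves every pixel and would require:
    s = 0.5
    K_resized = np.diag([s, s, 1.0]) @ K  # scales fx, fy, cx, cy by s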