open-mmlab / mmdetection3d

OpenMMLab's next-generation platform for general 3D object detection.
https://mmdetection3d.readthedocs.io/en/latest/
Apache License 2.0

3D gtbbox #2437

Open wistful-8029 opened 1 year ago

wistful-8029 commented 1 year ago

When I project the 3D gt bboxes I read from the dataset onto the image using the built-in method, there is an obvious misalignment. Why is this?

JingweiZhang12 commented 1 year ago

Could you provide more details about your question?

wistful-8029 commented 1 year ago

Using the points, img_metas, and gt_bboxes_3d collected by the configuration file's pipeline, I plotted the x/y coordinates of the point cloud together with the x/y coordinates of gt_bboxes_3d with plt, and the positions were not aligned. When I used the built-in API to project the gt bboxes onto an image, they were also misaligned. I suspect data augmentation is the cause, but I thought augmentation was supposed to be synchronous: the coordinates of the point cloud and of the gt bboxes should be transformed at the same time. I don't know what to do.
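
To make what I mean by "synchronous" concrete, here is a minimal sketch using the GlobalRotScaleTrans pipeline op; the exact keys are my reading of mmdet3d's pipeline code, so treat it as an illustration rather than verified usage. The op receives the points and the gt boxes in one dict and applies the same random rotation/scale/translation to both:

from mmdet3d.datasets.pipelines import GlobalRotScaleTrans

# `points` is a LiDARPoints instance and `gt_bboxes_3d` a LiDARInstance3DBoxes
# instance, taken from a pipeline dict before formatting.
aug = GlobalRotScaleTrans(rot_range=[-0.78539816, 0.78539816],
                          scale_ratio_range=[0.95, 1.05],
                          translation_std=[0, 0, 0])
input_dict = dict(points=points, gt_bboxes_3d=gt_bboxes_3d,
                  bbox3d_fields=['gt_bboxes_3d'])
input_dict = aug(input_dict)
# The sampled parameters are recorded in the dict as pcd_rotation,
# pcd_scale_factor and pcd_trans, which apply_3d_transformation can later
# use to undo the augmentation.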

JingweiZhang12 commented 1 year ago

Did you plot the data of a standard dataset? Do you have some visualized pictures?

wistful-8029 commented 1 year ago

https://cdn.jsdelivr.net/gh/wistfully8029/myPic/202304181013198.png As shown in this picture, which I drew with the built-in API: orange is the gt bbox that was read, blue is the predicted bbox. I don't understand why the orange boxes are offset.

JingweiZhang12 commented 1 year ago

Could you provide a snippet of the code, so we can reproduce it?

wistful-8029 commented 1 year ago

code 1:

import os

import cv2
from mmcv import Config

from mmdet3d.apis import (inference_multi_modality_detector, init_model,
                          show_result_meshlab)
from mmdet3d.core.visualizer import show_multi_modality_result
from mmdet3d.datasets import build_dataset

# Build the multi-modality detector from my config and checkpoint.
config_file = '/home/wistful/work/mmdetection3d/configs/my_config/my_dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class.py'
checkpoint_file = '/home/wistful/ResultDir/my_pth/mxvnet/dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class_20210831_060805-83442923.pth'
model = init_model(config_file, checkpoint_file, device='cuda:1')

# Build the training dataset so I can read points, img_metas and gt boxes.
os.chdir('/home/wistful/work/mmdetection3d/')
config_file = 'configs/_base_/datasets/kitti-3d-3class-multi.py'
cfg = Config.fromfile(config_file)
datasets = [build_dataset(cfg.data.train)]

out_dir = "/home/wistful/work/mmdetection3d/visual_img/kitti/"
pkl_dir = '/home/wistful/work/mmdetection3d/data/kitti/kitti_pkl_output/'
num = 50
for i in range(num):
    # One training sample plus the meta information that goes with it.
    cur_data = datasets[0][i]
    img_metas = cur_data.get('img_metas').data
    pts_file = img_metas.get('pts_filename')
    img = cur_data.get('img').data
    img_file_path = img_metas.get('filename')
    name = img_file_path.split('/')[-1].split('.')[0]
    ann_file = pkl_dir + 'kitti_' + name + '_infos.pkl'

    # Ground truth comes from the training pipeline; the projection matrix
    # comes from the same img_metas.
    image = cv2.imread(img_file_path)
    gt_bboxes = cur_data.get('gt_bboxes_3d').data
    project_mat = img_metas.get('lidar2img')

    # Run inference on the same frame to get predicted boxes.
    result, data = inference_multi_modality_detector(model, pts_file,
                                                     img_file_path, ann_file)
    bboxes_data = result[0]['pts_bbox']['boxes_3d']

    show_result_meshlab(data, result, out_dir, show=False)

    # Project both gt and predicted boxes onto the image.
    show_multi_modality_result(img=image,
                               box_mode='lidar',
                               gt_bboxes=gt_bboxes,
                               img_metas=img_metas,
                               pred_bboxes=bboxes_data,
                               proj_mat=project_mat,
                               out_dir=out_dir,
                               filename=name,
                               show=False)

I ran the above code in Jupyter; sorry it looks a little messy. It uses a configuration file to read the point cloud and the ground truth, runs the detector to get predictions, and finally calls show_multi_modality_result to produce the visualization: https://cdn.jsdelivr.net/gh/wistfully8029/myPic/202304191419477.png. Orange is the ground truth, blue is the prediction, and the ground truth is always offset.
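
My current guess at the cause: the gt boxes come out of the training pipeline, so they carry the random GlobalRotScaleTrans / RandomFlip3D augmentation, while the image on disk and the lidar2img matrix describe the original, un-augmented frame. Below is an unverified sketch of undoing the recorded augmentation on the boxes before projecting them; it assumes the usual pcd_* meta keys are present and that the pipeline order was GlobalRotScaleTrans (flow R, S, T) followed by RandomFlip3D (flow HF), so the reverse order is flip, translation, scaling, rotation:

# Undo the recorded augmentation on the gt boxes, in reverse pipeline order.
gt = cur_data.get('gt_bboxes_3d').data.clone()
if img_metas.get('pcd_horizontal_flip', False):
    gt.flip('horizontal')                       # flipping is its own inverse
gt.translate(-img_metas['pcd_trans'])           # undo random translation
gt.scale(1.0 / img_metas['pcd_scale_factor'])   # undo random scaling
gt.rotate(img_metas['pcd_rotation'].inverse())  # undo random rotation (3x3 mat)
# `gt` should now live in the same frame as the image and lidar2img.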

code 2:

import os

import matplotlib.pyplot as plt
from mmcv import Config

from mmdet3d.datasets import build_dataset
from mmdet3d.models.fusion_layers.coord_transform import apply_3d_transformation

os.chdir('/home/wistful/work/mmdetection3d/')

# config_file = 'configs/my_config/no_data_enhancement.py'
config_file = 'configs/my_config/dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class.py'
cfg = Config.fromfile(config_file)
datasets = [build_dataset(cfg.data.train)]

# Take one training sample: points, gt boxes and meta info.
item = 4
point = datasets[0][item].get('points').data
gt_bboxes_3d = datasets[0][item].get('gt_bboxes_3d').data.tensor
img_meta = datasets[0][item].get('img_metas').data

# Undo the pipeline augmentation on the points and plot them in BEV.
point = apply_3d_transformation(point[:, :3], 'LIDAR', img_meta, reverse=True)
x = point[:, 1]
y = point[:, 0]
plt.scatter(x, y, marker='.', s=0.01)

# Plot the gt box centers (note: these are not reversed here).
gt_x = gt_bboxes_3d[:, 1]
gt_y = gt_bboxes_3d[:, 0]
plt.scatter(gt_x, gt_y, marker='*', s=0.9, c='red')

plt.show()

The visualization is as follows; the red dots are the gt box center points, with a clear deviation: https://cdn.jsdelivr.net/gh/wistfully8029/myPic/202304191429776.png
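
Thinking about it more: in code 2 the reverse transform is applied to the points but not to the gt centers, so the points end up in the original frame while the centers stay in the augmented one; the deviation would then be expected. A minimal sketch of reversing the centers the same way, assuming apply_3d_transformation accepts any N x 3 tensor in LiDAR coordinates (which is how I already use it for the points):

# Reverse the pipeline augmentation on the gt centers too, so points and
# centers are plotted in the same (original) frame.
gt_centers = apply_3d_transformation(
    gt_bboxes_3d[:, :3], 'LIDAR', img_meta, reverse=True)
plt.scatter(gt_centers[:, 1], gt_centers[:, 0], marker='*', s=0.9, c='red')

Alternatively, plotting both without reversing should also align them; the mismatch only appears when one side is reversed and the other is not.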

Did you notice the commented-out configuration file at the top of code 2? When I switch the configuration to configs/my_config/no_data_enhancement.py, in which I removed the data augmentation from dv_mvx-fpn_second_secfpn_adamw_2x8_80e_kitti-3d-3class.py, the visualization looks like this: https://cdn.jsdelivr.net/gh/wistfully8029/myPic/202304191429070.png
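
For reference, removing the augmentation amounts to dropping the random transforms from the train pipeline. A rough sketch of what my edited pipeline looks like; this is abbreviated and hand-written here, not a stock mmdet3d file, and point_cloud_range / class_names come from the rest of the config:

train_pipeline = [
    dict(type='LoadPointsFromFile', coord_type='LIDAR', load_dim=4, use_dim=4),
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
    # dict(type='GlobalRotScaleTrans', ...),  # removed
    # dict(type='RandomFlip3D', ...),         # removed
    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='PointShuffle'),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D',
         keys=['points', 'img', 'gt_bboxes_3d', 'gt_labels_3d']),
]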

Is there some problem with how I am using this function: from mmdet3d.models.fusion_layers.coord_transform import apply_3d_transformation?

My mmdet3d related versions are as follows:

TorchVision: 0.9.1+cu111
OpenCV: 4.6.0
MMCV: 1.6.2
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMDetection: 2.28.1
MMSegmentation: 0.30.0
MMDetection3D: 1.0.0rc6+47285b3
spconv2.0: True

jiaying0w0 commented 8 months ago

Hi, I have the same problem. Did you find a way to solve it? Thank you for any help.