V2AI / Det3D

World's first general purpose 3D object detection codebase.
https://arxiv.org/abs/1908.09492
Apache License 2.0

create_data.py----Questions about gt_boxes and transform matrix #158

Closed: gkd2020 closed this issue 2 years ago

gkd2020 commented 2 years ago

I'm building my work on top of your code. Thanks for releasing it!

Instructions To Reproduce the Issue:

  1. Question about gt_boxes. Source code:

    gt_boxes = np.concatenate([locs, dims, velocity[:, :2], -rots - np.pi / 2], axis=1)
    # gt_boxes = np.concatenate([locs, dims, rots], axis=1)

    I ran into velocity values that show up as NaN. Since I don't need velocity, I changed the code. After the change:

    # gt_boxes = np.concatenate([locs, dims, velocity[:, :2], -rots - np.pi / 2], axis=1)
    gt_boxes = np.concatenate([locs, dims, rots], axis=1)

    The gt_boxes come from get_sample_data(). That function returns the data path and all annotations related to the sample_data, with the boxes transformed into the current sensor's coordinate frame. So the labels have already been transformed into the lidar coordinate system? I'm confused about -rots - np.pi / 2. What is its function?

  2. Question about transform_matrix: after running create_data.py, I got transform_matrix along with ref_from_car, car_from_global, global_from_car, and car_from_current.

    <1> The labels have already been transformed into the lidar coordinate system, so are these matrices useless?

    <2> These matrices are different from the KITTI calib (e.g. Tr_velo_to_cam, (R|t)). Each point has [x, y, z, intensity, ring_index], but the matrix is 4x4, for example:

    ```
    [[ 0.00203327   0.99970406   0.02424172   0.943713 ],
     [-0.99970406   0.00217566  -0.00584864   0.982268 ],
     [-0.92979249  -0.0243238    0.992373873  1.84023  ],
     [ 0.           0.           0.           1.       ]]
    ```

    What does the last row mean? How should I use it? (A sketch of my understanding follows below.)
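For reference, my current understanding (not confirmed by the authors) is that these are 4x4 homogeneous transform matrices: the top-left 3x3 block is a rotation R, the right column is a translation t, and the constant last row [0, 0, 0, 1] exists only so that rotation and translation can be applied in a single matrix multiplication. A minimal sketch of how I would apply such a matrix to the point cloud, with intensity and ring_index passed through untouched (the variable names are mine):

    import numpy as np

    # Hypothetical 4x4 transform; ref_from_car from create_data.py has this shape.
    ref_from_car = np.eye(4)  # [:3, :3] = rotation R, [:3, 3] = translation t

    points = np.random.rand(100, 5)  # [x, y, z, intensity, ring_index]

    # Pad xyz with a trailing 1 so one matrix multiply applies R and t together;
    # the matrix's last row [0, 0, 0, 1] simply preserves that padding.
    xyz1 = np.concatenate([points[:, :3], np.ones((len(points), 1))], axis=1)
    points[:, :3] = (ref_from_car @ xyz1.T).T[:, :3]
    # intensity and ring_index are not spatial, so they are left unchanged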

Many thanks!

poodarchu commented 2 years ago

Can you organize all your questions into one place? Otherwise I might miss some of your concerns.

gkd2020 commented 2 years ago

Hi @poodarchu, sorry about that; I have reorganized my questions.

  1. My question is about transforming labels into lidar coordinates. I want to get labels in lidar coordinates, and the code does the following steps:

    ref_lidar_path, ref_boxes, _ = get_sample_data(nusc, ref_sd_token) 
    locs = np.array([b.center for b in ref_boxes]).reshape(-1, 3)
    dims = np.array([b.wlh for b in ref_boxes]).reshape(-1, 3)
    rots = np.array([quaternion_yaw(b.orientation) for b in ref_boxes]).reshape(-1, 1)
    gt_boxes = np.concatenate([locs, dims, -rots - np.pi / 2], axis=1)  ###
    # gt_boxes = np.concatenate([locs, dims, rots], axis=1)             ###

    Your code uses the nuScenes devkit's get_sample_data, which has already converted the yaw angle into the lidar coordinate system. Maybe -rots - np.pi / 2 is unnecessary, and I just need gt_boxes = np.concatenate([locs, dims, rots], axis=1)? (See the small numeric sketch at the end of this comment.)

    
    from typing import List, Tuple

    import numpy as np
    from nuscenes.utils.data_classes import Box
    from pyquaternion import Quaternion


    def get_sample_data(
        nusc, sample_data_token: str, selected_anntokens: List[str] = None
    ) -> Tuple[str, List[Box], np.array]:
        """
        Returns the data path as well as all annotations related to that sample_data.
        Note that the boxes are transformed into the current sensor's coordinate frame.
        :param sample_data_token: Sample_data token.
        :param selected_anntokens: If provided only return the selected annotation.
        :return: (data_path, boxes, camera_intrinsic <np.array: 3, 3>)
        """
        # Retrieve sensor & pose records
        sd_record = nusc.get("sample_data", sample_data_token)
        cs_record = nusc.get("calibrated_sensor", sd_record["calibrated_sensor_token"])
        sensor_record = nusc.get("sensor", cs_record["sensor_token"])
        pose_record = nusc.get("ego_pose", sd_record["ego_pose_token"])

        data_path = nusc.get_sample_data_path(sample_data_token)

        if sensor_record["modality"] == "camera":
            cam_intrinsic = np.array(cs_record["camera_intrinsic"])
            imsize = (sd_record["width"], sd_record["height"])
        else:
            cam_intrinsic = None
            imsize = None

        # Retrieve all sample annotations and map to sensor coordinate system.
        if selected_anntokens is not None:
            boxes = list(map(nusc.get_box, selected_anntokens))
        else:
            boxes = nusc.get_boxes(sample_data_token)

        # Make list of Box objects including coord system transforms.
        box_list = []
        for box in boxes:
            # Move box from global to ego vehicle coord system
            box.translate(-np.array(pose_record["translation"]))
            box.rotate(Quaternion(pose_record["rotation"]).inverse)

            # Move box from ego vehicle to sensor coord system
            box.translate(-np.array(cs_record["translation"]))
            box.rotate(Quaternion(cs_record["rotation"]).inverse)

            box_list.append(box)

        return data_path, box_list, cam_intrinsic

  2. My other question is that the velocity values come out as NaN.
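For my own check, here is a tiny numeric sketch of what I think -rots - np.pi / 2 does: negate the yaw (flip the rotation direction) and shift the reference axis by 90 degrees. This is just my reading of the code, not a confirmed explanation; the helper name and the wrap-around are mine:

    import numpy as np

    def nuscenes_yaw_to_det3d(yaw):
        """Hypothetical helper: negate the yaw (flip rotation direction),
        subtract pi/2 (shift the reference axis), then wrap to [-pi, pi)."""
        ry = -yaw - np.pi / 2
        return (ry + np.pi) % (2 * np.pi) - np.pi

    # A box heading along +x in the nuScenes convention (yaw = 0)
    # becomes ry = -pi/2 in the converted convention:
    print(nuscenes_yaw_to_det3d(0.0))        # -1.5707963...
    print(nuscenes_yaw_to_det3d(np.pi / 2))  # -3.1415926... (wrapped)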