V2AI / Det3D

World's first general purpose 3D object detection codebase.
https://arxiv.org/abs/1908.09492
Apache License 2.0

create_data.py----Questions about gt_boxes and transform matrix #158

Closed: gkd2020 closed this issue 2 years ago

gkd2020 commented 2 years ago

I'm building my work on top of your code. Thanks for releasing it!

Instructions To Reproduce the Issue:

  1. Question about gt_boxes. Source code:

    gt_boxes = np.concatenate([locs, dims, velocity[:, :2], -rots - np.pi / 2], axis=1)
    # gt_boxes = np.concatenate([locs, dims, rots], axis=1)

    I ran into velocity values that show up as NaN. Since I don't need velocity, I changed the code. After the change:

    # gt_boxes = np.concatenate([locs, dims, velocity[:, :2], -rots - np.pi / 2], axis=1)
    gt_boxes = np.concatenate([locs, dims, rots], axis=1)

    The gt_boxes come from get_sample_data(). That function returns the data path and all annotations related to the sample_data, with the boxes transformed into the current sensor's coordinate frame. So the labels have already been transformed into the lidar coordinate system? I'm confused about -rots - np.pi / 2. What is its function?

  2. Question about transform_matrix: after running create_data.py, I got transform_matrix along with ref_from_car, car_from_global, global_from_car, and car_from_current.

    <1> The labels have already been transformed into the lidar coordinate system, so are these matrices useless?

    <2> These matrices are different from the KITTI calib (e.g. Tr_velo_to_cam, (R|t)). Each point has [x, y, z, intensity, ring_index], but the matrix is 4x4, for example:

    ```
    [[ 0.00203327   0.99970406   0.02424172   0.943713 ],
     [-0.99970406   0.00217566  -0.00584864   0.982268 ],
     [-0.92979249  -0.0243238    0.992373873  1.84023  ],
     [ 0.           0.           0.           1.       ]]
    ```

    What does the last row mean? How should I use it? (A sketch of my understanding follows below.)
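For reference, my current understanding (not confirmed by the authors) is that these are 4x4 homogeneous transform matrices: the top-left 3x3 block is a rotation R, the right column is a translation t, and the constant last row [0, 0, 0, 1] exists only so that rotation and translation can be applied in a single matrix multiplication. A minimal sketch of how I would apply such a matrix to the point cloud, with intensity and ring_index passed through untouched (the variable names are mine):

    import numpy as np

    # Hypothetical 4x4 transform; ref_from_car from create_data.py has this shape.
    ref_from_car = np.eye(4)  # [:3, :3] = rotation R, [:3, 3] = translation t

    points = np.random.rand(100, 5)  # [x, y, z, intensity, ring_index]

    # Pad xyz with a trailing 1 so one matrix multiply applies R and t together;
    # the matrix's last row [0, 0, 0, 1] simply preserves that padding.
    xyz1 = np.concatenate([points[:, :3], np.ones((len(points), 1))], axis=1)
    points[:, :3] = (ref_from_car @ xyz1.T).T[:, :3]
    # intensity and ring_index are not spatial, so they are left unchanged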

Many thanks!

poodarchu commented 2 years ago

Can you organize all your questions into one place? Otherwise I might miss some of your concerns.

gkd2020 commented 2 years ago

Hi @poodarchu, sorry about that; I have reorganized my questions.

  1. My question is about transforming labels into lidar coordinates. I want to get labels in lidar coordinates, and the code does the following steps:

    ref_lidar_path, ref_boxes, _ = get_sample_data(nusc, ref_sd_token) 
    locs = np.array([b.center for b in ref_boxes]).reshape(-1, 3)
    dims = np.array([b.wlh for b in ref_boxes]).reshape(-1, 3)
    rots = np.array([quaternion_yaw(b.orientation) for b in ref_boxes]).reshape(-1, 1)
    gt_boxes = np.concatenate([locs, dims, -rots - np.pi / 2], axis=1)  ###
    # gt_boxes = np.concatenate([locs, dims, rots], axis=1)             ###

    Your code uses the nuScenes devkit's get_sample_data, which has already converted the yaw angle into the lidar coordinate system. Maybe -rots - np.pi / 2 is unnecessary, and I just need gt_boxes = np.concatenate([locs, dims, rots], axis=1)? (See the small numeric sketch at the end of this comment.)

    
    from typing import List, Tuple

    import numpy as np
    from nuscenes.utils.data_classes import Box
    from pyquaternion import Quaternion


    def get_sample_data(
        nusc, sample_data_token: str, selected_anntokens: List[str] = None
    ) -> Tuple[str, List[Box], np.array]:
        """
        Returns the data path as well as all annotations related to that sample_data.
        Note that the boxes are transformed into the current sensor's coordinate frame.
        :param sample_data_token: Sample_data token.
        :param selected_anntokens: If provided only return the selected annotation.
        :return: (data_path, boxes, camera_intrinsic <np.array: 3, 3>)
        """
        # Retrieve sensor & pose records
        sd_record = nusc.get("sample_data", sample_data_token)
        cs_record = nusc.get("calibrated_sensor", sd_record["calibrated_sensor_token"])
        sensor_record = nusc.get("sensor", cs_record["sensor_token"])
        pose_record = nusc.get("ego_pose", sd_record["ego_pose_token"])

        data_path = nusc.get_sample_data_path(sample_data_token)

        if sensor_record["modality"] == "camera":
            cam_intrinsic = np.array(cs_record["camera_intrinsic"])
            imsize = (sd_record["width"], sd_record["height"])
        else:
            cam_intrinsic = None
            imsize = None

        # Retrieve all sample annotations and map to sensor coordinate system.
        if selected_anntokens is not None:
            boxes = list(map(nusc.get_box, selected_anntokens))
        else:
            boxes = nusc.get_boxes(sample_data_token)

        # Make list of Box objects including coord system transforms.
        box_list = []
        for box in boxes:
            # Move box from global to ego vehicle coord system
            box.translate(-np.array(pose_record["translation"]))
            box.rotate(Quaternion(pose_record["rotation"]).inverse)

            # Move box from ego vehicle to sensor coord system
            box.translate(-np.array(cs_record["translation"]))
            box.rotate(Quaternion(cs_record["rotation"]).inverse)

            box_list.append(box)

        return data_path, box_list, cam_intrinsic

  2. My other question is that the velocity values come out as NaN.
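For my own check, here is a tiny numeric sketch of what I think -rots - np.pi / 2 does: negate the yaw (flip the rotation direction) and shift the reference axis by 90 degrees. This is just my reading of the code, not a confirmed explanation; the helper name and the wrap-around are mine:

    import numpy as np

    def nuscenes_yaw_to_det3d(yaw):
        """Hypothetical helper: negate the yaw (flip rotation direction),
        subtract pi/2 (shift the reference axis), then wrap to [-pi, pi)."""
        ry = -yaw - np.pi / 2
        return (ry + np.pi) % (2 * np.pi) - np.pi

    # A box heading along +x in the nuScenes convention (yaw = 0)
    # becomes ry = -pi/2 in the converted convention:
    print(nuscenes_yaw_to_det3d(0.0))        # -1.5707963...
    print(nuscenes_yaw_to_det3d(np.pi / 2))  # -3.1415926... (wrapped)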