WeijingShi / Point-GNN

Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud, CVPR 2020.
MIT License
523 stars 114 forks

train_config question #61

Closed typhoonlee closed 3 years ago

typhoonlee commented 3 years ago

Hi, sorry to bother you. Your work is very helpful to me, and I have been studying it carefully. What does expend_factor mean in train_config? I noticed that this parameter is used in both data augmentation and class label assignment.

WeijingShi commented 3 years ago

Hi @typhoonlee, thanks for your interest. "expend_factor" is a misspelling of "expand_factor". We use it to resize the bounding boxes from the label files. We found that the annotated bounding boxes are sometimes too tight and crop the actual object, which introduces errors when we use the box to select points for assigning the object label. So we optionally use the "expand_factor" to enlarge the boxes so that all of the object's points are selected. For example, expand_factor=(1.2, 1.1, 1.0) means points are selected using 120% of the box height, 110% of the box width, and the unchanged box length.

https://github.com/WeijingShi/Point-GNN/blob/48f3d79d5b101d3a4b8439ba74c92fcad4f7cab0/dataset/kitti_dataset.py#L85

Edit: we also use the "expand_factor" to select points during data augmentation, since we want all of the object's points to be selected and moved.

Hope it helps.
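To make the effect concrete, here is a minimal sketch (not the repository's actual code, which lives in `box3d_to_cam_points` in `dataset/kitti_dataset.py`; the function name and plain per-axis scaling below are illustrative assumptions) of how a per-axis expand factor would enlarge the labeled box dimensions before points are selected:

```python
def expand_box_dims(h, w, l, expand_factor=(1.0, 1.0, 1.0)):
    """Enlarge KITTI box dimensions (height, width, length) by per-axis
    factors, so the box captures points a too-tight annotation would crop."""
    fh, fw, fl = expand_factor
    return h * fh, w * fw, l * fl

# expand_factor=(1.2, 1.1, 1.0): 120% height, 110% width, unchanged length
eh, ew, el = expand_box_dims(1.5, 1.6, 4.0, expand_factor=(1.2, 1.1, 1.0))
```

Points are then tested against the enlarged box instead of the raw labeled one, both for class-label assignment and when moving an object's points during augmentation.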

jiaminglei-lei commented 3 years ago

Hello @WeijingShi. In the function def box3d_to_cam_points(label, expend_factor=(1.0, 1.0, 1.0)): you wrote:

corners = np.array([[ l/2,  delta_h/2,  w/2],  # front up right
                        [ l/2,  delta_h/2, -w/2],  # front up left
                        [-l/2,  delta_h/2, -w/2],  # back up left
                        [-l/2,  delta_h/2,  w/2],  # back up right
                        [ l/2, -h-delta_h/2,  w/2],  # front down right
                        [ l/2, -h-delta_h/2, -w/2],  # front down left
                        [-l/2, -h-delta_h/2, -w/2],  # back down left
                        [-l/2, -h-delta_h/2,  w/2]]) # back down right
  1. Are these corner coordinates in object coordinates? And is the object coordinate system x = right, y = up, z = forward? And then
    r_corners = corners.dot(np.transpose(R))
  2. Does this line project them into camera coordinates? These values are actually the respective offsets.
  3. After projecting them into camera coordinates, do you add the offsets to the original object center coordinates to get the new corner coordinates?
  4. But I'm confused: after projecting, the xyz axes in object coordinates do not match the xyz axes in camera coordinates. Could you tell me where the problem is?

typhoonlee commented 3 years ago

Thank you very much!

WeijingShi commented 3 years ago

Hi @as3382246,

  1. The corners are in the object frame: x is the object's length direction, y is the height direction, and z is the width direction. I'm not quite sure what the "x = right, y = up, z = forward" notion refers to, but I think it is incorrect: if a car's heading is "forward", that should be the x axis.
  2. to 4. The corners are 3D and in the object frame. We transform them into the camera frame by a rotation (r_corners = corners.dot(R.T)) followed by a translation (r_corners + np.array([tx, ty, tz])). This should give the correct corner coordinates in the camera frame. Let me know if it works. Thanks,
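The rotate-then-translate step can be sketched end to end as follows. This is an illustrative reconstruction, not the repository's exact function: it assumes KITTI-style labels where (tx, ty, tz) is the box location in the camera frame, yaw is the rotation about the camera y axis, and it uses the repository's corner ordering with the expand factor fixed at (1, 1, 1):

```python
import numpy as np

def box3d_corners_cam(h, w, l, tx, ty, tz, yaw):
    """Build the 8 box corners in the object frame (x = length, y = height,
    z = width), rotate them into the camera frame, then translate them by
    the labeled box location (tx, ty, tz)."""
    # object-frame corner offsets; the "up"/"down" labels follow the
    # comments in the repository's corner array (expand factor = 1)
    corners = np.array([[ l/2,  0,  w/2],   # front up right
                        [ l/2,  0, -w/2],   # front up left
                        [-l/2,  0, -w/2],   # back up left
                        [-l/2,  0,  w/2],   # back up right
                        [ l/2, -h,  w/2],   # front down right
                        [ l/2, -h, -w/2],   # front down left
                        [-l/2, -h, -w/2],   # back down left
                        [-l/2, -h,  w/2]])  # back down right
    # rotation about the camera y axis (KITTI rotation_y)
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[ c, 0, s],
                  [ 0, 1, 0],
                  [-s, 0, c]])
    # rotate the offsets into the camera frame, then add the translation
    return corners.dot(R.T) + np.array([tx, ty, tz])
```

With yaw = 0 the rotation is the identity, so each corner is simply the object-frame offset plus the translation, which is exactly the intuition in points 2 to 4 above.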
jiaminglei-lei commented 3 years ago

Hi @WeijingShi. I got it. You're right: I had the wrong idea about the coordinate rotation, translation, and transform. Thanks for your kind help!!!

WeijingShi commented 3 years ago

Glad it works!