Closed: sarimmehdi closed this issue 4 years ago
Hi @sarimmehdi, do you mean the yaw angle? The yaw is predicted in the range [-pi/4, 3pi/4], which is sufficient to generate 3D bounding boxes. All yaw angles are shifted into that range by adding or subtracting pi. Inside the network, we treat side-view objects [-pi/4, pi/4] and front/back-view objects [pi/4, 3pi/4] as two separate classes. If you want to predict the yaw angle over the full [-pi/4, 7pi/4] range for your application, you can add two more prediction headers for [3pi/4, 5pi/4] and [5pi/4, 7pi/4]. Hope this is helpful. Thanks,
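A minimal sketch of that shift, assuming a standalone NumPy helper (the function name is mine, not code from the repo):

```python
import numpy as np

def shift_yaw_to_half_range(yaw):
    """Shift an arbitrary yaw into [-pi/4, 3pi/4) by adding/subtracting
    multiples of pi; a 3D box has the same footprint after a pi flip.
    Hypothetical helper, not code from the Point-GNN repo."""
    return np.mod(yaw + np.pi / 4, np.pi) - np.pi / 4
```

For example, a yaw of -pi/2 maps to pi/2, which describes the same box footprint.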
Hi @WeijingShi, can you tell me where I need to make the changes (the name of the file) so that the yaw output is between -pi and pi, as in the official KITTI labels? Thanks
Hi @sarimmehdi, the bounding box labels are split and assigned to each point here:
https://github.com/WeijingShi/Point-GNN/blob/2baf24f9556907f23e2e4018f1b756dac3f6c497/dataset/kitti_dataset.py#L1184
Then each point's label is encoded here:
https://github.com/WeijingShi/Point-GNN/blob/2baf24f9556907f23e2e4018f1b756dac3f6c497/models/box_encoding.py#L231
You can add a new split method that outputs four subclasses: [-pi/4, pi/4], [pi/4, 3pi/4], [3pi/4, 5pi/4], [5pi/4, 7pi/4]. Then center the yaw values during encoding by subtracting 0, pi/2, pi, 3pi/2 respectively. (That is just one choice; you can use your preferred method.) Then use the new methods in training and evaluation:
https://github.com/WeijingShi/Point-GNN/blob/2baf24f9556907f23e2e4018f1b756dac3f6c497/train.py#L70
https://github.com/WeijingShi/Point-GNN/blob/2baf24f9556907f23e2e4018f1b756dac3f6c497/train.py#L113
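A sketch of what such a four-way split could compute per box (the function and its name are hypothetical, not code from the repo; the repo's method works per point, but the binning logic is the same):

```python
import numpy as np

def yaw_bin_and_offset(yaw):
    """Assign a yaw to one of four bins, [-pi/4, pi/4), [pi/4, 3pi/4),
    [3pi/4, 5pi/4), [5pi/4, 7pi/4), and return (bin_index, centered_yaw).
    Hypothetical sketch of the four-category split described above."""
    # Wrap into [-pi/4, 7pi/4) so the four bins tile the full circle.
    yaw = np.mod(yaw + np.pi / 4, 2 * np.pi) - np.pi / 4
    bin_index = int((yaw + np.pi / 4) // (np.pi / 2))
    # Subtract the bin center: 0, pi/2, pi, or 3pi/2.
    centered = yaw - bin_index * (np.pi / 2)
    return bin_index, centered
```

With this choice the regression target `centered` always lies in [-pi/4, pi/4), regardless of the bin.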
Hello. Sorry for the late reply. I am unable to follow your instructions. Can you show me what you mean through a simple code segment that indicates where I need to make the change? In particular, I want to know where to add the new split method you mentioned, as I see no mention of angle values there.
EDIT: So, I obtained the angle values as-is from your neural net. Now, exactly what angle value should I add to the predicted rotation value to bring it from your custom range into the official KITTI range, which is between -pi and pi?
Hi @sarimmehdi, the variable "yaw" in the code is the bounding box angle you are looking for. To predict yaw values in [0, 2pi], you need to modify the code and retrain the network. One way to do that is to have four prediction headers, which handle objects with yaw angles in [-pi/4, pi/4], [pi/4, 3pi/4], [3pi/4, 5pi/4], [5pi/4, 7pi/4] respectively.
There are two main modifications that you need. The first is to change the training labels. In the following assign_classaware_car_label_to_points method, bounding boxes are separated into two categories, [-pi/4, pi/4] and [pi/4, 3pi/4]. You need to change this to four categories: [-pi/4, pi/4], [pi/4, 3pi/4], [3pi/4, 5pi/4], [5pi/4, 7pi/4]. https://github.com/WeijingShi/Point-GNN/blob/2baf24f9556907f23e2e4018f1b756dac3f6c497/dataset/kitti_dataset.py#L1184
The second modification is to the encoding method. In the following classaware_all_class_box_encoding method, training labels are encoded. Specifically, the yaw angles in each category are centered. Since you now have four yaw categories [-pi/4, pi/4], [pi/4, 3pi/4], [3pi/4, 5pi/4], [5pi/4, 7pi/4], you need to subtract 0, pi/2, pi, 3pi/2 respectively. https://github.com/WeijingShi/Point-GNN/blob/2baf24f9556907f23e2e4018f1b756dac3f6c497/models/box_encoding.py#L231 The decoding method needs to be changed accordingly. https://github.com/WeijingShi/Point-GNN/blob/2baf24f9556907f23e2e4018f1b756dac3f6c497/models/box_encoding.py#L265
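In code, the centering and its inverse could look like this sketch (names are mine, not the repo's; the real methods are the linked encoder and decoder):

```python
import numpy as np

# Bin centers for the four yaw categories above (assumed ordering).
YAW_BIN_CENTERS = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2])

def encode_yaw(yaw, bin_index):
    """Center the yaw within its bin by subtracting 0, pi/2, pi, or 3pi/2."""
    return yaw - YAW_BIN_CENTERS[bin_index]

def decode_yaw(encoded_yaw, bin_index):
    """Invert the encoding by adding the bin center back."""
    return encoded_yaw + YAW_BIN_CENTERS[bin_index]
```

The key constraint is that decoding must be the exact inverse of encoding, bin by bin, or the recovered yaw will be off by a multiple of pi/2.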
If you add new methods instead of modifying the existing ones, make sure you use the new methods in train.py and run.py.
After retraining the network, if you need [-pi, pi], you can simply compute np.mod(yaw + pi, 2*pi) - pi.
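As a small standalone helper (plain NumPy, nothing repo-specific):

```python
import numpy as np

def wrap_to_pi(yaw):
    """Map any yaw angle to its equivalent in [-pi, pi)."""
    return np.mod(yaw + np.pi, 2 * np.pi) - np.pi
```

For example, a yaw of 3pi/2 wraps to -pi/2.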
Is there any reason why you are not regressing to angles between -pi and pi? This probably hurts your accuracy when you submit to the KITTI leaderboard.
Hi @sarimmehdi, the mAP computes the overlap area between the predicted and ground-truth bounding boxes. Changing a box's yaw angle from yaw to yaw - pi does not affect its overlap. Therefore, to keep the network simple, we just use two headers, [-0.25pi, 0.25pi] and [0.25pi, 0.75pi], instead of four.
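The claim that a pi flip leaves the overlap unchanged can be checked numerically: a box's bird's-eye footprint is identical after rotating it by pi (box_corners_bev is a hypothetical helper, not repo code):

```python
import numpy as np

def box_corners_bev(cx, cz, l, w, yaw):
    """Corners of a box footprint in bird's-eye view (x-z plane)."""
    local = np.array([[ l / 2,  w / 2], [-l / 2,  w / 2],
                      [-l / 2, -w / 2], [ l / 2, -w / 2]])
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s], [s, c]])
    return local @ rot.T + np.array([cx, cz])

# Rotating the same box by pi only permutes its corners, so any IoU
# computed from the footprint is unchanged.
a = box_corners_bev(1.0, 2.0, 4.0, 1.8, 0.3)
b = box_corners_bev(1.0, 2.0, 4.0, 1.8, 0.3 + np.pi)
```

Here `b` equals `a` with its corner order shifted by two, which is why mAP cannot distinguish the two headings.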
Since you are using your own orientation values, can you provide a script that draws the bounding boxes correctly on the image plane? That would be much more useful than retraining the entire network from scratch so that it outputs angle values in the correct range.
As of now, this is the script I use to draw 3D bounding boxes; it requires angles in the range of -pi to pi:
import numpy as np
import cv2
from math import cos, sin

def plot_3d_bbox(img, calib, bbox3d_center, bbox3d_dims, bbox3d_roty):
    box_3d = []
    box_pts = []
    h, w, l = bbox3d_dims
    # Bottom-face corners (y = 0) and top-face corners (y = -h) in the object frame.
    p0, p1, p2, p3 = (np.array([l/2, 0, w/2]), np.array([-l/2, 0, w/2]),
                      np.array([-l/2, 0, -w/2]), np.array([l/2, 0, -w/2]))
    p4, p5, p6, p7 = (np.array([l/2, -h, w/2]), np.array([-l/2, -h, w/2]),
                      np.array([-l/2, -h, -w/2]), np.array([l/2, -h, -w/2]))
    pts_array = np.array([p0, p1, p2, p3, p4, p5, p6, p7]).transpose()
    # Rotation about the camera y-axis by the box yaw.
    rot_mat = np.array([[cos(bbox3d_roty), 0, sin(bbox3d_roty)],
                        [0, 1, 0],
                        [-sin(bbox3d_roty), 0, cos(bbox3d_roty)]])
    pts_array = np.matmul(rot_mat, pts_array).transpose()
    for pt_array in pts_array:
        box_pts.append(np.append(pt_array + bbox3d_center, 1))
        box_3d.append(get_img_pt(np.append(pt_array + bbox3d_center, 1), calib))
    for i in [0, 1, 2, 3]:
        pt1, pt2 = box_3d[i % 4], box_3d[(i + 1) % 4]              # bottom corners
        # Note: the top corner of corner (i+1) is ((i+1)%4)+4; the previous
        # (i+5)%8 wrapped to a bottom corner when i == 3.
        pt3, pt4 = box_3d[(i % 4) + 4], box_3d[((i + 1) % 4) + 4]  # top corners
        cv2.line(img, pt1, pt2, (0, 0, 255), 1)  # bottom edge
        cv2.line(img, pt1, pt3, (0, 0, 255), 1)  # vertical edge
        cv2.line(img, pt2, pt4, (0, 0, 255), 1)  # vertical edge
        cv2.line(img, pt3, pt4, (0, 0, 255), 1)  # top edge
    # Draw two intersecting lines on the front face of the 3D bbox.
    cv2.line(img, box_3d[0], box_3d[-1], (0, 0, 255), 1)
    cv2.line(img, box_3d[3], box_3d[4], (0, 0, 255), 1)
    center_pt_img = get_img_pt(np.append(bbox3d_center, 1), calib)
    cv2.circle(img, center_pt_img, 3, (255, 255, 255), -1)
    return box_pts

def get_img_pt(pt, calib):
    # Project a homogeneous 3D camera-frame point through the P2 matrix.
    projected_point = np.dot(calib['P2'], pt)
    projected_point = projected_point[:2] / projected_point[2]
    # cv2 drawing functions expect plain int coordinates.
    return (int(projected_point[0]), int(projected_point[1]))
Maybe you can suggest to me what changes I need to make to draw bounding boxes so that they face in the right direction according to your angles?
If you want the correct direction, you need a yaw angle within [0, 2pi]. The pre-trained model outputs only half the range ([-0.25pi, 0.75pi]), so drawing directly from the output can flip the direction. You can change the network output to [0, 2pi] by doubling the prediction headers and retraining, as we discussed.
Hello. I notice that your neural net almost always outputs a positive angle value and rarely a negative one. Is there any reason why it cannot regress to the correct angle value?