Closed chowkamlee81 closed 4 years ago
It depends on the device. You can check the timings in the log file provided.
@chowkamlee81 In the log file, the total inference time is a little slow because of the CPU NMS. With the GPU NMS, it is 76 ms for the 0.1 voxel size model on a Titan Xp.
@tianweiy Could you kindly point me to the place in the codebase where I can use the GPU NMS rather than the CPU one?
This one is the GPU NMS taken from PointRCNN nms. Additionally, add these two methods to your det3d/core/bbox/box_torch_ops.py
def rotate_nms_pcdet(boxes, scores, thresh, pre_maxsize=None, post_max_size=None):
    """
    :param boxes: (N, 5) [x1, y1, x2, y2, ry]
    :param scores: (N)
    :param thresh: IoU threshold above which a lower-scored box is suppressed
    :return: indices of the kept boxes, in descending score order
    """
    # areas = (x2 - x1) * (y2 - y1)
    order = scores.sort(0, descending=True)[1]
    if pre_maxsize is not None:
        order = order[:pre_maxsize]

    boxes = boxes[order].contiguous()

    # keep is a CPU buffer that the CUDA kernel fills with the kept indices
    keep = torch.LongTensor(boxes.size(0))
    num_out = iou3d_nms_cuda.nms_gpu(boxes, keep, thresh)
    selected = order[keep[:num_out].cuda()].contiguous()

    if post_max_size is not None:
        selected = selected[:post_max_size]

    return selected

def boxes3d_to_bevboxes_lidar_torch(boxes3d):
    """
    :param boxes3d: (N, 7) [x, y, z, w, l, h, ry] in LiDAR coords
    :return:
        boxes_bev: (N, 5) [x1, y1, x2, y2, ry]
    """
    boxes_bev = boxes3d.new(torch.Size((boxes3d.shape[0], 5)))

    cu, cv = boxes3d[:, 0], boxes3d[:, 1]
    half_w, half_l = boxes3d[:, 3] / 2, boxes3d[:, 4] / 2
    boxes_bev[:, 0], boxes_bev[:, 1] = cu - half_w, cv - half_l
    boxes_bev[:, 2], boxes_bev[:, 3] = cu + half_w, cv + half_l
    boxes_bev[:, 4] = boxes3d[:, -1]
    return boxes_bev
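As a quick sanity check of the conversion above, here is a dependency-free sketch that applies the same arithmetic to a single box using plain Python floats. The name `bev_from_box3d` is hypothetical and only mirrors `boxes3d_to_bevboxes_lidar_torch`; it is not part of det3d or PCDet.

```python
def bev_from_box3d(box3d):
    """Mirror of boxes3d_to_bevboxes_lidar_torch for one box (hypothetical helper).

    box3d: [x, y, z, w, l, h, ry] in LiDAR coordinates.
    Returns [x1, y1, x2, y2, ry].
    """
    x, y, _z, w, l, _h, ry = box3d
    half_w, half_l = w / 2, l / 2
    # Corners come from the unrotated extents; the heading is carried along unchanged.
    return [x - half_w, y - half_l, x + half_w, y + half_l, ry]


if __name__ == "__main__":
    # A 2 x 4 box centred at (10, 2) with heading 0.3 rad.
    print(bev_from_box3d([10.0, 2.0, -1.0, 2.0, 4.0, 1.5, 0.3]))
```

Note that z and h are dropped entirely, so the NMS that follows only looks at overlap in the bird's-eye view.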
Then, in det3d/models/bbox_heads/mg_head.py, change lines 1000-1011 from
if top_scores.shape[0] != 0:
    if test_cfg.score_threshold > 0.0:
        box_preds = box_preds[top_scores_keep]
        if self.use_direction_classifier:
            dir_labels = dir_labels[top_scores_keep]
        top_labels = top_labels[top_scores_keep]
    boxes_for_nms = box_preds[:, [0, 1, 3, 4, -1]]
    if not test_cfg.nms.use_rotate_nms:
        box_preds_corners = box_torch_ops.center_to_corner_box2d(
            boxes_for_nms[:, :2],
            boxes_for_nms[:, 2:4],
            boxes_for_nms[:, 4],
        )
        boxes_for_nms = box_torch_ops.corner_to_standup_nd(
            box_preds_corners
        )
    # the nms in 3d detection just remove overlap boxes.
    selected = nms_func(
        boxes_for_nms,
        top_scores,
        pre_max_size=test_cfg.nms.nms_pre_max_size,
        post_max_size=test_cfg.nms.nms_post_max_size,
        iou_threshold=test_cfg.nms.nms_iou_threshold,
    )
else:
    selected = []
to
if top_scores.shape[0] != 0:
    if test_cfg.score_threshold > 0.0:
        box_preds = box_preds[top_scores_keep]
        if self.use_direction_classifier:
            dir_labels = dir_labels[top_scores_keep]
        top_labels = top_labels[top_scores_keep]
    # boxes_for_nms = box_preds[:, [0, 1, 3, 4, -1]]
    # GPU NMS from PCDet(https://github.com/sshaoshuai/PCDet)
    boxes_for_nms = box_torch_ops.boxes3d_to_bevboxes_lidar_torch(box_preds)
    if not test_cfg.nms.use_rotate_nms:
        box_preds_corners = box_torch_ops.center_to_corner_box2d(
            boxes_for_nms[:, :2],
            boxes_for_nms[:, 2:4],
            boxes_for_nms[:, 4],
        )
        boxes_for_nms = box_torch_ops.corner_to_standup_nd(
            box_preds_corners
        )
    # the nms in 3d detection just remove overlap boxes.
    selected = box_torch_ops.rotate_nms_pcdet(
        boxes_for_nms,
        top_scores,
        thresh=test_cfg.nms.nms_iou_threshold,
        pre_maxsize=test_cfg.nms.nms_pre_max_size,
        post_max_size=test_cfg.nms.nms_post_max_size,
    )
You also need to add those import statements.
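To make the roles of `thresh`, `pre_maxsize` and `post_max_size` concrete, here is a minimal pure-Python sketch of greedy NMS with the same three controls. It is only a CPU reference for checking the selection logic: it is not the CUDA kernel, it ignores the heading `ry` and treats boxes as axis-aligned, and the names `axis_aligned_iou` and `greedy_nms` are hypothetical.

```python
def axis_aligned_iou(a, b):
    """IoU of two boxes given as [x1, y1, x2, y2] (rotation ignored)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, thresh, pre_maxsize=None, post_max_size=None):
    """Return indices of kept boxes, highest score first."""
    # Sort candidates by descending score, then cap the candidate list.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if pre_maxsize is not None:
        order = order[:pre_maxsize]

    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already kept box too much.
        if all(axis_aligned_iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)

    # Cap the number of survivors.
    if post_max_size is not None:
        keep = keep[:post_max_size]
    return keep


if __name__ == "__main__":
    boxes = [[0.0, 0.0, 2.0, 2.0], [0.1, 0.0, 2.1, 2.0], [5.0, 5.0, 7.0, 7.0]]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores, thresh=0.5))
```

Because `pre_maxsize` is applied before the pairwise comparisons, lowering it is the cheapest way to bound NMS cost, whichever implementation is used.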
Additionally, if you can wait until tomorrow, I will open-source my forked version of det3d, which reaches 51.9 mAP and 62.2 NDS for the CBGS 0.1 voxel size model with a latency of 78 ms. I will work on a merge request this weekend.
I will try what you suggested. I will also wait for your forked version, which should help speed up execution.
I used "import iou3d_cuda" instead of iou3d_nms_cuda. I observed only a 4 ms improvement. I am waiting for your forked version. Kindly help.
My implementation doesn't really speed things up. It is only more accurate because of better augmentation and hyperparameters. In my experience, a good CPU and SSD matter more for inference speed. I originally planned to post the repo now, but we have to train one more ablation model. Hopefully, it will be done by this weekend.
I am mainly interested in speed, since I feel the accuracy is already sufficient. I am requesting your help to speed up execution, since the algorithm is too slow.
What inference time do you get in your setting?
If you are really interested in speed, you should consider using the PointPillars model with CBGS. It can get similar performance on large objects (e.g. cars) while being considerably faster.
I tried PointPillars, but the accuracy is not good enough: mAP is only 20.45 using this repo. The speed is acceptable, though, even with the Second/Traveler59 code. So the CBGS method has good accuracy, but its timings are too poor to deploy and test in a real vehicle environment. I hope there is some solution.
PointPillars inference time on an RTX 2080 Ti: 39 ms.
With CBGS: voxelization time 54 ms, feature extraction time 48 ms, NMS time 30 ms. Roughly 118-130 ms on average on an RTX 2080 Ti.
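A per-stage breakdown like the one above can be collected with a small timer. The sketch below uses only the standard library; the `StageTimer` class and the stage names are hypothetical and not part of det3d. When the stages run on a GPU through PyTorch, passing `torch.cuda.synchronize` as `sync` is an assumption about the setup, but it matters there: CUDA kernels launch asynchronously, so without a sync the time tends to be attributed to whichever later stage first waits on the device.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulate wall-clock time per named stage (hypothetical helper)."""

    def __init__(self, sync=None):
        # sync: optional callable run before and after each stage,
        # e.g. torch.cuda.synchronize when timing GPU work.
        self.sync = sync
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name):
        if self.sync is not None:
            self.sync()
        start = time.perf_counter()
        try:
            yield
        finally:
            if self.sync is not None:
                self.sync()
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def mean_ms(self):
        """Mean time per call, in milliseconds, for each stage."""
        return {n: 1000.0 * self.totals[n] / self.counts[n] for n in self.totals}


if __name__ == "__main__":
    timer = StageTimer()
    for _ in range(3):  # three placeholder "frames"
        with timer.stage("voxelization"):
            sum(range(10000))  # stand-in for real work
        with timer.stage("nms"):
            sorted(range(10000), reverse=True)  # stand-in for real work
    for name, ms in timer.mean_ms().items():
        print(f"{name}: {ms:.3f} ms")
```

Averaging over many frames and discarding the first few is usually worthwhile, since the first iterations tend to include one-time warm-up costs.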
Your PointPillars timing is the same as mine. However, your CBGS timing is too slow. My voxelization and NMS didn't take much time. Do you use 8 workers for the dataloader? Is your I/O or CPU too slow?
FYI, my version of the PointPillars baseline reaches 45.5 mAP and 58.4 NDS. I will release it this week and merge it into this repo in the near future.
metrics_summaryjson.txt These are the results after training for 20 epochs: mAP 20.3 and NDS 30.
I am surprised that you can get 45.4 mAP using the same repo. Could you kindly upload your codebase? It would help a lot.
For CBGS, I didn't use the provided script; I ran it from the command line with only 1 GPU.
I am not using the DataLoader, since this has to run in a car environment, so I have to process each point cloud frame by frame. The DataLoader goes through the PyTorch library, but that doesn't work for frame-by-frame processing in the car, so I had to write a separate ROS node to process bag files. Kindly suggest how to proceed.
I see. The discussion in this issue has become really long. Send me an email and I will get back to you on your questions later.
I will send an email to yintianwei@utexas.edu.
Is this correct?
Yeah, will get back to you tomorrow morning
Kindly mention the inference time for one point cloud.