Inference time for one pointcloud?

chowkamlee81 commented 4 years ago

Could you please share the inference time for one point cloud?

poodarchu commented 4 years ago

It depends on the device. You can check the provided log file.

tianweiy commented 4 years ago

@chowkamlee81 In the log file, the total inference time is a little slow because of the CPU NMS. With the GPU NMS, it is 76 ms for the 0.1 voxel size model on a Titan Xp.
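Latency figures like the one above depend on how they are measured. The following is a minimal sketch of a timing harness, assuming a hypothetical `run_once` callable that wraps one end-to-end frame; `measure_latency` and its parameters are illustrative names, not part of Det3D. Because CUDA kernels run asynchronously, on a GPU a blocking call such as `torch.cuda.synchronize` would be passed as `sync` so that the clock is read only after the work has finished.

```python
import statistics
import time


def measure_latency(run_once, warmup=3, iters=10, sync=None):
    """Return mean and median wall-clock latency of run_once() in milliseconds.

    run_once: hypothetical callable that processes one frame end to end.
    sync: optional callable that blocks until queued device work is done
          (for PyTorch on a GPU this would be torch.cuda.synchronize).
    """
    def timed_call():
        if sync is not None:
            sync()  # make sure earlier work does not leak into this sample
        start = time.perf_counter()
        run_once()
        if sync is not None:
            sync()  # wait for asynchronous kernels before reading the clock
        return (time.perf_counter() - start) * 1000.0

    for _ in range(warmup):
        timed_call()  # warm-up runs are discarded (lazy init, caches)
    samples = [timed_call() for _ in range(iters)]
    return {
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "samples": len(samples),
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a real single-frame inference call.
    print(measure_latency(lambda: sum(i * i for i in range(10000))))
```

Reporting the median alongside the mean makes a single slow frame (for example, the first one after loading the model) less likely to distort the number.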

chowkamlee81 commented 4 years ago

@tianweiy Could you point me to where in the codebase I can switch from the CPU NMS to the GPU NMS? Any help would be appreciated.

tianweiy commented 4 years ago

This one is the GPU NMS taken from the PointRCNN NMS. Additionally, add these two functions to your det3d/core/bbox/box_torch_ops.py:

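import torch
import iou3d_nms_cuda  # compiled CUDA NMS extension; adjust the name to match your build

def rotate_nms_pcdet(boxes, scores, thresh, pre_maxsize=None, post_max_size=None):
    """
    GPU NMS on bird's-eye-view boxes.
    :param boxes: (N, 5) [x1, y1, x2, y2, ry]
    :param scores: (N)
    :param thresh: IoU threshold for suppression
    :param pre_maxsize: number of top-scoring boxes kept before NMS (None keeps all)
    :param post_max_size: number of boxes kept after NMS (None keeps all)
    :return: indices (into the input) of the kept boxes
    """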
    # areas = (x2 - x1) * (y2 - y1)
    order = scores.sort(0, descending=True)[1]
    if pre_maxsize is not None:
        order = order[:pre_maxsize]

    boxes = boxes[order].contiguous()

    keep = torch.LongTensor(boxes.size(0))
    num_out = iou3d_nms_cuda.nms_gpu(boxes, keep, thresh)
    selected = order[keep[:num_out].cuda()].contiguous()

    if post_max_size is not None:
        selected = selected[:post_max_size]

    return selected 

def boxes3d_to_bevboxes_lidar_torch(boxes3d):
    """
    :param boxes3d: (N, 7) [x, y, z, w, l, h, ry] in LiDAR coords
    :return:
        boxes_bev: (N, 5) [x1, y1, x2, y2, ry]
    """
    boxes_bev = boxes3d.new(torch.Size((boxes3d.shape[0], 5)))

    cu, cv = boxes3d[:, 0], boxes3d[:, 1]

    half_w, half_l = boxes3d[:, 3] / 2, boxes3d[:, 4] / 2
    boxes_bev[:, 0], boxes_bev[:, 1] = cu - half_w, cv - half_l
    boxes_bev[:, 2], boxes_bev[:, 3] = cu + half_w, cv + half_l
    boxes_bev[:, 4] = boxes3d[:, -1]
    return boxes_bev
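
To make the box layout concrete, here is a plain-Python sketch of the same per-box arithmetic as `boxes3d_to_bevboxes_lidar_torch`, applied to a single made-up box. The helper name `box3d_to_bev` and the example values are illustrative only.

```python
def box3d_to_bev(box3d):
    """Map one [x, y, z, w, l, h, ry] box to [x1, y1, x2, y2, ry]."""
    x, y, _z, w, l, _h, ry = box3d
    # Corners are the centre shifted by half the width (x) and half the length (y).
    return [x - w / 2, y - l / 2, x + w / 2, y + l / 2, ry]


example = [10.0, 4.0, -1.0, 2.0, 4.0, 1.5, 0.3]
print(box3d_to_bev(example))  # [9.0, 2.0, 11.0, 6.0, 0.3]
```

Note that z and h are dropped: the NMS here operates purely in the bird's-eye-view plane.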

Then, in det3d/models/bbox_heads/mg_head.py, change lines 1000-1011 from

                if top_scores.shape[0] != 0:
                    if test_cfg.score_threshold > 0.0:
                        box_preds = box_preds[top_scores_keep]
                        if self.use_direction_classifier:
                            dir_labels = dir_labels[top_scores_keep]
                        top_labels = top_labels[top_scores_keep]
                    boxes_for_nms = box_preds[:, [0, 1, 3, 4, -1]]
                    if not test_cfg.nms.use_rotate_nms:
                        box_preds_corners = box_torch_ops.center_to_corner_box2d(
                            boxes_for_nms[:, :2],
                            boxes_for_nms[:, 2:4],
                            boxes_for_nms[:, 4],
                        )
                        boxes_for_nms = box_torch_ops.corner_to_standup_nd(
                            box_preds_corners
                        )
                    # the nms in 3d detection just remove overlap boxes.
                    selected = nms_func(
                        boxes_for_nms,
                        top_scores,
                        pre_max_size=test_cfg.nms.nms_pre_max_size,
                        post_max_size=test_cfg.nms.nms_post_max_size,
                        iou_threshold=test_cfg.nms.nms_iou_threshold,
                    )
                else:
                    selected = []

to

                if top_scores.shape[0] != 0:
                    if test_cfg.score_threshold > 0.0:
                        box_preds = box_preds[top_scores_keep]
                        if self.use_direction_classifier:
                            dir_labels = dir_labels[top_scores_keep]
                        top_labels = top_labels[top_scores_keep]
                    # boxes_for_nms = box_preds[:, [0, 1, 3, 4, -1]]

                    # GPU NMS from PCDet(https://github.com/sshaoshuai/PCDet) 
                    boxes_for_nms = box_torch_ops.boxes3d_to_bevboxes_lidar_torch(box_preds)
                    if not test_cfg.nms.use_rotate_nms:
                        box_preds_corners = box_torch_ops.center_to_corner_box2d(
                            boxes_for_nms[:, :2],
                            boxes_for_nms[:, 2:4],
                            boxes_for_nms[:, 4],
                        )
                        boxes_for_nms = box_torch_ops.corner_to_standup_nd(
                            box_preds_corners
                        )
                    # the nms in 3d detection just remove overlap boxes.

                    selected = box_torch_ops.rotate_nms_pcdet(
                        boxes_for_nms,
                        top_scores,
                        thresh=test_cfg.nms.nms_iou_threshold,
                        pre_maxsize=test_cfg.nms.nms_pre_max_size,
                        post_max_size=test_cfg.nms.nms_post_max_size,
                    )

You also need to add the corresponding import statements (torch and the compiled GPU NMS extension) at the top of box_torch_ops.py.
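
For readers unfamiliar with what `rotate_nms_pcdet` computes, its control flow (sort by score, truncate to `pre_maxsize`, suppress overlapping boxes, truncate to `post_max_size`) can be sketched in plain Python. This sketch swaps in axis-aligned IoU and ignores the rotation angle `ry`, so it is not equivalent to the rotated CUDA kernel; `greedy_nms` and `iou_axis_aligned` are hypothetical names used only for illustration.

```python
def iou_axis_aligned(a, b):
    """IoU of two boxes given as [x1, y1, x2, y2, ...]; extra fields are ignored."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def greedy_nms(boxes, scores, thresh, pre_maxsize=None, post_max_size=None):
    """Return indices of the kept boxes, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if pre_maxsize is not None:
        order = order[:pre_maxsize]
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already kept box too much.
        if all(iou_axis_aligned(boxes[i], boxes[k]) <= thresh for k in keep):
            keep.append(i)
    if post_max_size is not None:
        keep = keep[:post_max_size]
    return keep


boxes = [
    [0.0, 0.0, 2.0, 2.0, 0.0],  # box 0
    [0.1, 0.1, 2.1, 2.1, 0.0],  # box 1, nearly identical to box 0
    [5.0, 5.0, 7.0, 7.0, 0.0],  # box 2, far away from both
]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores, thresh=0.5))  # [0, 2]
```

The pairwise loop is quadratic in the number of boxes, which is why `pre_maxsize` matters for latency and why the comparison is worth moving to the GPU.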

tianweiy commented 4 years ago

Additionally, if you can wait until tomorrow, I will open-source my forked version of Det3D, which reaches 51.9 mAP and 62.2 NDS for the CBGS 0.1 voxel size model with a latency of 78 ms. I will work on a merge request this weekend.

chowkamlee81 commented 4 years ago

I will try what you suggested. I will also wait for your forked version, which should help speed up execution.

chowkamlee81 commented 4 years ago

I used "import iou3d_cuda" instead of iou3d_nms_cuda. I observed only a 4 ms improvement. I am waiting for your forked version. Any help would be appreciated.
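
Since the compiled extension can be installed under different module names (the snippet earlier in this thread uses `iou3d_nms_cuda`, while `iou3d_cuda` was used here), one option is to try each name in turn. This is a minimal sketch; `load_first_available` is a hypothetical helper, and whether both modules expose the same `nms_gpu` signature is an assumption that should be checked against the local build.

```python
import importlib


def load_first_available(candidates):
    """Import and return the first module in candidates that exists, else None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # not installed under this name; try the next one
    return None


# Module names taken from this thread; either may be absent on a given machine.
nms_ext = load_first_available(["iou3d_nms_cuda", "iou3d_cuda"])
if nms_ext is None:
    print("No GPU NMS extension found; falling back to the CPU NMS path.")
```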

tianweiy commented 4 years ago

My implementation doesn't really speed things up. It is only more accurate because of better augmentation and hyperparameters. In my experience, a good CPU + SSD is more important for inference speed. I originally planned to post the repo now, but we need to train one extra ablation model. Hopefully, it will be done by this weekend.

chowkamlee81 commented 4 years ago

I am mainly interested in speed, since I feel the accuracy is already sufficient. I would appreciate your help in speeding up inference, because the algorithm is currently too slow.

tianweiy commented 4 years ago

What inference time do you get in your setting?

tianweiy commented 4 years ago

I mean, if you are really interested in speed, you should consider using the PointPillars model with CBGS. It can reach similar performance on large objects (e.g. cars) while being considerably faster.

chowkamlee81 commented 4 years ago

I tried PointPillars, but the accuracy is not good enough: with this repo, the mAP is only 20.45. The speed is acceptable, though, even with the Second/Traveler59 code. The CBGS method's accuracy is good, but its timings are too poor to deploy and test in a real vehicle environment. I hope there is a solution.

chowkamlee81 commented 4 years ago

PointPillars inference time on an RTX 2080 Ti is 39 ms.

With CBGS:
Voxelization time: 54 ms
Feature extraction time: 48 ms
NMS time: 30 ms
That is roughly 118-130 ms on average on an RTX 2080 Ti.
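
To put the figures above side by side, the short calculation below sums the three reported CBGS stages and converts both models' latencies to frame rates. It uses only the numbers quoted in this comment; the stage sum comes to 132 ms, close to but not identical to the reported 118-130 ms average.

```python
# Per-stage timings reported above, in milliseconds.
cbgs_stages_ms = {"voxelization": 54, "feature extraction": 48, "nms": 30}
pointpillars_ms = 39

total_ms = sum(cbgs_stages_ms.values())
print(f"CBGS sum of stages: {total_ms} ms ({1000 / total_ms:.1f} frames/s)")
print(f"PointPillars: {pointpillars_ms} ms ({1000 / pointpillars_ms:.1f} frames/s)")

# Share of each stage in the summed CBGS time, to see where the time goes.
for stage, ms in cbgs_stages_ms.items():
    print(f"{stage}: {100 * ms / total_ms:.0f}% of the summed time")
```

By these numbers, voxelization and NMS together account for more than half of the summed time, which is why the CPU-side stages are the first thing to look at.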

tianweiy commented 4 years ago

Your PointPillars timing is the same as mine. However, your CBGS timing is too slow. My voxelization and NMS didn't take much time. Do you use 8 workers for the dataloader? Is your I/O or CPU too slow?

tianweiy commented 4 years ago

FYI, my version of the PointPillars baseline reaches 45.5 mAP and 58.4 NDS. I will release it this week and merge it into this repo in the near future.

chowkamlee81 commented 4 years ago

summary.pdf

metrics_summaryjson.txt

These are the results after training for 20 epochs: mAP 20.3 and NDS 30.

I am surprised that you can get 45.4 mAP using the same repo. Could you please upload your codebase? It would help a lot.

chowkamlee81 commented 4 years ago

For CBGS, I didn't use the provided script; I ran it from the command line with only 1 GPU.

chowkamlee81 commented 4 years ago

I am not using the DataLoader, since the model has to run in a car environment, where I have to process each point cloud frame by frame. The DataLoader is tied to the PyTorch library and does not fit frame-by-frame processing in the car, so I had to write a separate ROS node to process bag files. Please advise.
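
One common way to keep frame-by-frame processing from stalling on CPU work is to run preprocessing (for example, voxelization) in a background thread while the previous frame is in the network, which is roughly what DataLoader workers provide in batch mode. Below is a minimal sketch using only the standard library; `preprocess`, `infer`, and `run_pipeline` are hypothetical placeholders, not Det3D or ROS APIs, and whether the overlap helps in practice depends on whether the preprocessing code releases the GIL.

```python
import queue
import threading

_SENTINEL = object()


def run_pipeline(frames, preprocess, infer, max_pending=2):
    """Process frames in order, overlapping preprocess() with infer()."""
    pending = queue.Queue(maxsize=max_pending)  # bounded to limit added latency

    def producer():
        try:
            for frame in frames:
                pending.put(preprocess(frame))
        finally:
            pending.put(_SENTINEL)  # always signal the end of the stream

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()

    results = []
    while True:
        item = pending.get()
        if item is _SENTINEL:
            break
        results.append(infer(item))
    worker.join()
    return results


if __name__ == "__main__":
    # Stand-ins: "frames" are lists of numbers, not real point clouds.
    frames = [[1, 2, 3], [4, 5], [6]]
    print(run_pipeline(frames, preprocess=sorted, infer=sum))  # [6, 9, 6]
```

The small `max_pending` bound is deliberate: in a vehicle, a deep queue would trade throughput for stale detections.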

tianweiy commented 4 years ago

I see. The discussion in this issue has become really long. Send me an email and I will get back to you on your questions later.

chowkamlee81 commented 4 years ago

I will send an email to yintianwei@utexas.edu.

Is this correct?

tianweiy commented 4 years ago

Yeah, I will get back to you tomorrow morning.

V2AI / Det3D

Inference time for one pointcloud? #111