AutoLidarPerception / kitti_ros

A ROS-based player to replay the KITTI dataset. http://www.cvlibs.net/datasets/kitti/
http://www.cvlibs.net/datasets/kitti/
29 stars 7 forks source link

[Evaluation][3D Object] 3D object detection performance evaluation #18

Open Durant35 opened 6 years ago

Durant35 commented 6 years ago

Refer

Durant35 commented 6 years ago

AP 计算由来 (origin of the AP metric computation)

image

Durant35 commented 6 years ago

AP 计算公式 (AP computation formula)

image


// Select the detection-score thresholds at which precision/recall will be
// sampled, discretizing recall into N_SAMPLE_PTS (approximately) linearly
// spaced points.
//
// @param v              detection scores; sorted in place, descending
//                       (highest score is assumed to give the best /
//                       most confident detections)
// @param n_groundtruth  number of ground-truth objects; denominator of
//                       every recall value, so it must be > 0
// @return one score per reached recall step, to be evaluated later
vector<double> getThresholds(vector<double>& v, double n_groundtruth)
{

    // holds scores needed to compute N_SAMPLE_PTS recall values
    vector<double> t;

    // sort scores in descending order
    sort(v.begin(), v.end(), greater<double>());

    // get scores for linearly spaced recall
    double current_recall = 0;
    for (size_t i = 0; i < v.size(); i++) {

        // recall when keeping detections up to and including i (left),
        // and the recall one detection further on (right); at the last
        // detection there is no right-hand-side, so reuse the left one
        const double l_recall = (double) (i + 1) / n_groundtruth;
        const double r_recall = (i + 1 < v.size())
                                    ? (double) (i + 2) / n_groundtruth
                                    : l_recall;

        // if the right-hand-side recall is closer to the current recall
        // target than the left-hand-side one, skip the current detection
        // score and let the next detection represent this recall step
        if ((i + 1 < v.size()) &&
            (r_recall - current_recall) < (current_recall - l_recall))
            continue;

        // left recall is the best approximation: record this score and
        // advance to the next linearly spaced recall step
        // (a dead store `recall = l_recall` from the original was removed)
        t.push_back(v[i]);
        current_recall += 1.0 / (N_SAMPLE_PTS - 1.0);
    }
    return t;
}

...

    // get scores that must be evaluated for recall discretization
    thresholds = getThresholds(v, n_gt);
...
    // iterate on every frame of data
    for (int32_t i = 0; i < groundtruth.size(); i++) {

        // for all scores/recall thresholds do:
        for (int32_t t = 0; t < thresholds.size(); t++) {
            tPrData tmp = tPrData();
            // count TP/FP/FN (and orientation similarity when compute_aos
            // is set) for frame i, keeping only detections whose score
            // passes thresholds[t].
            // NOTE(review): the trailing `t == 38` flag presumably enables
            // debug output inside computeStatistics for one specific
            // threshold index — confirm against its definition.
            tmp = computeStatistics(current_class, groundtruth[i], detections[i], dontcare[i],
                                    ignored_gt[i], ignored_det[i], true, boxoverlap, metric,
                                    compute_aos, thresholds[t], t == 38);

            // add no. of TP, FP, FN, AOS for current frame to total evaluation for current threshold
            pr[t].tp += tmp.tp;
            pr[t].fp += tmp.fp;
            pr[t].fn += tmp.fn;
            // -1 marks "no similarity computed" for this frame; skip it
            // so it does not corrupt the accumulated AOS numerator
            if (tmp.similarity != -1)
                pr[t].similarity += tmp.similarity;
        }
    }

    // compute recall, precision and AOS
    vector<double> recall;
    precision.assign(N_SAMPLE_PTS, 0);
    if (compute_aos)
        aos.assign(N_SAMPLE_PTS, 0);
    double r = 0;
    for (int32_t i = 0; i < thresholds.size(); i++) {
        // recall    = TP / (TP + FN): fraction of ground truth recovered
        r = pr[i].tp / (double) (pr[i].tp + pr[i].fn);
        recall.push_back(r);
        // precision = TP / (TP + FP): fraction of detections that are correct
        precision[i] = pr[i].tp / (double) (pr[i].tp + pr[i].fp);
        if (compute_aos)
            // average orientation similarity, normalized like precision
            aos[i] = pr[i].similarity / (double) (pr[i].tp + pr[i].fp);
    }

    // filter precision and AOS using max_{i..end}(precision)
    // (monotone interpolation: each sample is replaced by the maximum over
    // all later samples, making the precision/recall curve non-increasing
    // before the area under it is averaged into AP)
    for (int32_t i = 0; i < thresholds.size(); i++) {
        precision[i] = *max_element(precision.begin() + i, precision.end());
        if (compute_aos)
            aos[i] = *max_element(aos.begin() + i, aos.end());
    }
Durant35 commented 6 years ago

借助 Ground Truth 的 IoU Overlap 计算 TP、FP (compute TP/FP using the IoU overlap with ground-truth boxes)

For cars we require an 3D bounding box overlap of 70%, while for pedestrians and cyclists we require a 3D bounding box overlap of 50%.

Durant35 commented 6 years ago

image image