imatge-upc / detection-2016-nipsws

Hierarchical Object Detection with Deep Reinforcement Learning
http://imatge-upc.github.io/detection-2016-nipsws/

Precision recall curve problems #13

Closed Anida-qin closed 7 years ago

Anida-qin commented 7 years ago

Hi, I used the metric standard you described, but I still cannot get a curve like yours. I don't know what's wrong and maybe I misunderstood something. Here is my metrics code: `iou_list` holds the IoU of each proposal generated by each action on one image against each ground-truth object, `gt_masks` is the same as in your code, and `confidence` is the Q-value of each action on one image. `precision_recall_per_img` is the list of (precision, recall) values, one per threshold (50 in total).

```python
import numpy as np

def precision_recall_curve_threshold(iou_list, gt_masks, confidence):
    # iou_list[j][i_ob]: IoU of proposal j with ground-truth object i_ob
    # gt_masks: ground-truth masks of one image (same format as in your code)
    # confidence: Q-value associated with each proposal
    threshold_list = np.arange(0, 5, 0.1)      # 50 confidence thresholds
    precision_recall_per_img = []
    TP_FN = float(gt_masks.shape[2])           # number of ground-truth objects

    # proposal indices sorted by decreasing confidence
    index = sorted(range(len(confidence)), key=lambda i: confidence[i], reverse=True)

    for i_th in range(len(threshold_list)):
        flag = [0] * int(TP_FN)    # 1 if the ground-truth object has been matched
        cls = [-1] * len(index)    # 1 = TP, 0 = FP, -1 = below threshold
        # count proposals whose confidence exceeds the current threshold
        proposal_num = 0
        for i in index:
            if confidence[i] > threshold_list[i_th]:
                proposal_num += 1
            else:
                break

        if proposal_num != 0:
            # for each ground-truth object, keep only the best-matching proposal
            for i_ob in range(int(TP_FN)):
                max_iou = 0
                max_id = -1
                for j in index[:proposal_num]:
                    if cls[j] != 1:
                        cls[j] = 0
                    if iou_list[j][i_ob] > 0.5 and iou_list[j][i_ob] > max_iou:
                        max_iou = iou_list[j][i_ob]
                        max_id = j
                if max_id != -1:
                    cls[max_id] = 1
                    flag[i_ob] = 1
            TP = float(cls.count(1))
            FP = float(cls.count(0))
            TP_r = float(flag.count(1))
            precision = TP / (TP + FP)
            recall = TP_r / TP_FN
            precision_recall_per_img.append((precision, recall))
        else:
            precision_recall_per_img.append((-1, -1))

    return precision_recall_per_img
```
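This is a minimal sketch of how the function can be called for one image, with dummy inputs just to show the shapes I am assuming (an IoU matrix of `[num_proposals, num_gt_objects]`, ground-truth masks where only `shape[2]` matters, and one Q-value per proposal):

```python
import numpy as np

# Dummy per-image inputs; only the shapes matter here (assumptions, not repo code):
iou_list = np.random.rand(6, 2)        # 6 proposals, 2 ground-truth objects
gt_masks = np.zeros((224, 224, 2))     # dummy masks, only shape[2] is used above
confidence = np.random.rand(6) * 5     # Q-values in the same range as the thresholds

precision_recall_c_new = precision_recall_curve_threshold(iou_list, gt_masks, confidence)
print(len(precision_recall_c_new))     # 50 (precision, recall) pairs, one per threshold
```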

Each time I get the precision-recall values for one test image, I sum the precision and recall over all test images at each threshold:

    for i in range(50):
        if precision_recall_c_new[i][0] != -1:
            sum_p[i] += precision_recall_c_new[i][0]
            sum_r[i] += precision_recall_c_new[i][1]
            count[i] += 1

And then I take the mean like this:

    for i in range(50):
        if count[i] != 0:
            f1.write(str([float(sum_p[i]) / count[i], float(sum_r[i]) / count[i]]))
            f1.write('\n')
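Putting the two snippets together, this is roughly what the whole averaging step looks like; `per_image_results` is just a placeholder for the list of per-image outputs, and the output file name is only an example:

```python
# per_image_results is a placeholder: one entry per test image, each entry being
# the 50 (precision, recall) pairs returned by precision_recall_curve_threshold.
num_thresholds = 50
sum_p = [0.0] * num_thresholds
sum_r = [0.0] * num_thresholds
count = [0] * num_thresholds

for precision_recall_c_new in per_image_results:
    for i in range(num_thresholds):
        if precision_recall_c_new[i][0] != -1:   # skip thresholds with no proposals
            sum_p[i] += precision_recall_c_new[i][0]
            sum_r[i] += precision_recall_c_new[i][1]
            count[i] += 1

with open('precision_recall.txt', 'w') as f1:    # example output file
    for i in range(num_thresholds):
        if count[i] != 0:
            f1.write(str([sum_p[i] / count[i], sum_r[i] / count[i]]))
            f1.write('\n')
```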

Am I doing something wrong when calculating the curve?

miriambellver commented 7 years ago

Hi!

The confidence of each bounding box that I considered is not the Q-value of the action that generated that bounding box, but the Q-value of the trigger action at that moment. I don't know if you are doing this.
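Something along these lines (the names and the trigger index here are just illustrative, not the exact variables in the repo):

```python
# Illustrative only: q_values_per_step holds the Q-value vector predicted at each
# step of the search sequence, regions_per_step the bounding box visited at that
# step, and TRIGGER_INDEX is the assumed position of the trigger action.
TRIGGER_INDEX = 5

regions = []
confidences = []
for q_values, region in zip(q_values_per_step, regions_per_step):
    regions.append(region)
    # confidence of this box = Q-value of the trigger action at this step,
    # not the Q-value of the movement action that produced the box
    confidences.append(q_values[TRIGGER_INDEX])
```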

Miriam

Anida-qin commented 7 years ago

Hi! Sorry for replying late. I tried many times. At first I used the wrong confidence, then I switched to the Q-value of the trigger action at that moment. It improved, but it is still not ideal. I would like to know whether you count every proposal with IoU > 0.5 as a TP, or whether each ground-truth object can have only one proposal counted as a TP. Here are the precision-recall values I got; in each pair the first value is precision and the second is recall, and the threshold increases from top to bottom:

    [0.22499416433239963, 0.2314542483660131]
    [0.22682072829131653, 0.2265522875816994]
    [0.221260458206271, 0.22151067323481122]
    [0.22318398623817343, 0.22151067323481122]
    [0.2238075297120523, 0.2209380234505863]
    [0.26063829787234044, 0.22056737588652484]
    [0.25732600732600736, 0.21547619047619052]
    [0.2740963855421687, 0.21696787148594376]
    [0.2901678657074341, 0.2267386091127098]
    [0.3089080459770115, 0.23836206896551726]
    [0.3157894736842105, 0.25771929824561407]
    [0.3227848101265822, 0.2550632911392404]
    [0.32575757575757575, 0.26111111111111107]
    [0.3157894736842105, 0.2570175438596491]
    [0.30851063829787234, 0.2457446808510638]
    [0.3125, 0.23874999999999996]
    [0.3088235294117647, 0.23676470588235296]
    [0.35714285714285715, 0.26964285714285713]
    [0.375, 0.27291666666666664]
    [0.36363636363636365, 0.28863636363636364]
    [0.3333333333333333, 0.24166666666666664]
    [0.25, 0.146875]
    [0.2857142857142857, 0.16785714285714287]
    [0.4444444444444444, 0.2611111111111111]
    [0.5, 0.29375]
    [0.5, 0.29375]
    [0.42857142857142855, 0.19285714285714287]
    [0.25, 0.0625]
    [0.25, 0.0625]
    [0.3333333333333333, 0.08333333333333333]
    [0.0, 0.0]
    [0.0, 0.0]

miriambellver commented 7 years ago

From all the regions selected during one sequence, I checked whether any of them had an intersection over union above 0.5 with any object of the ground truth, and that counts as a TP, so one ground-truth object has only one TP.
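In code terms, something like this (the IoU matrix and the names are just for illustration, not the repo's actual code):

```python
import numpy as np

# Illustration only: iou is a [num_regions, num_gt_objects] matrix with the IoU of
# every region selected during one sequence against every ground-truth object.
def count_true_positives(iou, iou_threshold=0.5):
    tp = 0
    for i_ob in range(iou.shape[1]):
        # each ground-truth object contributes at most one TP, no matter how
        # many regions overlap it by more than the threshold
        if np.any(iou[:, i_ob] > iou_threshold):
            tp += 1
    return tp
```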