Owen-Liuyuxuan / visualDet3D

Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving / YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection
https://owen-liuyuxuan.github.io/papers_reading_sharing.github.io/3dDetection/GroundAwareConvultion/
Apache License 2.0

strange result #51

Closed by sjg918 2 years ago

sjg918 commented 2 years ago

Hi. When I use the results to draw bounding boxes, they look like this (images: 000001.png, 002056.png). There is a clear difference between the YOLOStereo3D results and the KITTI labels, BUT the evaluation is fine. The evaluation result is below (produced with the "eval.py" you provided). This code works fine with other outputs (e.g. RTM3D, PointRCNN). Even when I used someone else's evaluation code, I got reasonable results.

+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=
Car AP(Average Precision)@0.70, 0.70, 0.70:
bbox AP:95.58, 79.98, 62.91
bev AP:71.71, 45.69, 34.87
3d AP:58.22, 34.54, 26.28
aos AP:87.68, 70.20, 55.01
Car AP(Average Precision)@0.70, 0.50, 0.50:
bbox AP:95.58, 79.98, 62.91
bev AP:91.18, 69.85, 53.53
3d AP:88.50, 66.66, 52.34
aos AP:87.68, 70.20, 55.01

Pedestrian AP(Average Precision)@0.50, 0.50, 0.50:
bbox AP:46.67, 37.24, 31.17
bev AP:28.24, 21.06, 17.15
3d AP:24.95, 18.51, 14.87
aos AP:28.57, 22.89, 19.04
Pedestrian AP(Average Precision)@0.50, 0.25, 0.25:
bbox AP:46.67, 37.24, 31.17
bev AP:46.63, 36.25, 29.98
3d AP:46.45, 36.05, 29.81
aos AP:28.57, 22.89, 19.04
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=

Here is my output: yolostereo3D_valresult.zip. What's the problem?

sjg918 commented 2 years ago

Here is the code I used for the visualization:

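import os

import cv2
import numpy as np


def compute_shortest_distance(l, w, h, x, y, z, ry):
    # Returns the smallest camera-frame z (depth) among the 8 box corners.
    # rotation matrix about the camera Y axis (yaw angle ry)
    c = np.cos(ry)
    s = np.sin(ry)
    R = np.array([[c, 0, s],
                  [0, 1, 0],
                  [-s, 0, c]])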

    # 3d bounding box corners
    x_corners = [l / 2, l / 2, -l / 2, -l / 2, l / 2, l / 2, -l / 2, -l / 2]
    y_corners = [0, 0, 0, 0, -h, -h, -h, -h]
    z_corners = [w / 2, -w / 2, -w / 2, w / 2, w / 2, -w / 2, -w / 2, w / 2]

    # rotate and translate 3d bounding box
    corners_3d = np.dot(R, np.vstack([x_corners, y_corners, z_corners]))

    #corners_3d[0, :] = corners_3d[0, :] + x
    #corners_3d[1, :] = corners_3d[1, :] + y
    corners_3d[2, :] = corners_3d[2, :] + z

    return corners_3d[2, :].min()

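def convert_cls(c):
    # map a KITTI class name to an integer id
    if c == 'Car':
        return 1
    elif c == 'Pedestrian':
        return 2
    elif c == 'Cyclist':
        return 3
    else:
        # `assert 'no~'` never fails (a non-empty string is truthy), so raise explicitly
        raise ValueError('unsupported class: %s' % c)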

def GenResultImage_fromkitti(kittihome, kittiresultdir, resultpath):
    txt_list = os.listdir(kittiresultdir)

    for txt in txt_list:
        label_list = []
        imgnum = txt.split('.')[0]
        left_img = cv2.imread(kittihome + 'image_2/' + imgnum + '.png')
        #calib = read_calib_file(kittihome + 'calib/' + imgnum + '.txt')
        #K = calib['P2']

        with open(kittiresultdir + imgnum + '.txt', mode='r') as f:
            bbox = f.readlines()
            if len(bbox) == 0:
                pass
            else:
                bbox = [i.replace("\n", "") for i in bbox]
                for box in bbox:
                    box = box.split(' ')
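                    # KITTI fields: 0 type, 1 truncated, 2 occluded, 3 alpha, 4-7 bbox,
                    # 8-10 dimensions (h, w, l), 11-13 location (x, y, z), 14 rotation_y, 15 score
                    cls, xmin, ymin, xmax, ymax = box[0], float(box[4]), float(box[5]), float(box[6]), float(box[7])
                    h, w, l = float(box[8]), float(box[9]), float(box[10])  # length is field 10, not field 1 (truncation)
                    x, y, z = float(box[11]), float(box[12]), float(box[13])
                    ry = float(box[14])
                    score = float(box[15]) if len(box) == 16 else 1.00  # ground-truth labels carry no score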

                    #xmin, ymin, xmax, ymax = compute_2d_bbox_from_3d_bbox(l, w, h, x, y, z, ry, K)
                    depth = compute_shortest_distance(l, w, h, x, y, z, ry)
                    label = '%d %d %d %d %d %.2f %.2f %d %.2f %.2f\n'\
                        % (
                            convert_cls(cls), int(xmin), int(ymin), int(xmax), int(ymax), depth, 
                            0, 0, 0, score
                            )
                    label_list.append(label)
                    continue

        if len(label_list) == 0:
            pass
        else:
            for obj in label_list:
                obj = obj.replace("\n", "").split(' ')
                xmin, ymin, xmax, ymax = int(obj[1]), int(obj[2]), int(obj[3]), int(obj[4])
                cv2.putText(left_img, obj[5], (xmin+1, ymin+1), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), 2)
                cv2.rectangle(left_img, (xmin, ymin), (xmax ,ymax), (0, 255, 255), 3)
                continue
        cv2.imwrite(resultpath + imgnum + '.png', left_img)
        print(imgnum)
        continue

It works fine for other outputs.

(images: 000001, 002056)

This is the output of RTM3D (RTM3D_1127.zip). Thank you!
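For reference, the indexing in the loop above relies on the KITTI result-file layout, where each line holds 15 whitespace-separated fields plus an optional score. A minimal sketch of a parser that names each field, which can make an index slip easier to spot, might look like this. `KittiObject`, `parse_kitti_line` and the sample line are hypothetical and made up for illustration; they are not part of visualDet3D or of the attached results.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class KittiObject:
    """One object from a KITTI label/result line (hypothetical helper)."""
    cls: str
    truncated: float
    occluded: float
    alpha: float
    bbox: Tuple[float, float, float, float]  # xmin, ymin, xmax, ymax in pixels
    h: float   # box height in metres
    w: float   # box width in metres
    l: float   # box length in metres
    x: float   # location in camera coordinates, metres
    y: float
    z: float
    ry: float  # rotation around the camera Y axis, radians
    score: Optional[float]  # None when the line has no score (ground-truth labels)


def parse_kitti_line(line: str) -> KittiObject:
    """Split one line into named fields; expects 15 fields, or 16 with a score."""
    fields = line.split()
    if len(fields) not in (15, 16):
        raise ValueError('expected 15 or 16 fields, got %d' % len(fields))
    v = [float(t) for t in fields[1:]]  # everything after the class name is numeric
    return KittiObject(
        cls=fields[0],
        truncated=v[0], occluded=v[1], alpha=v[2],
        bbox=(v[3], v[4], v[5], v[6]),
        h=v[7], w=v[8], l=v[9],
        x=v[10], y=v[11], z=v[12],
        ry=v[13],
        score=v[14] if len(v) == 15 else None,
    )


# Made-up sample line, not taken from any real result file.
sample = 'Car 0.00 0 -1.57 614.24 181.78 727.31 284.77 1.57 1.73 4.15 1.00 1.75 13.22 -1.62 0.95'
obj = parse_kitti_line(sample)
print(obj.cls, obj.l, obj.z, obj.score)
```

Unlike the loop above, this sketch returns `None` rather than 1.00 when the score is missing, so ground-truth labels and detections stay distinguishable.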

sjg918 commented 2 years ago

sorry... please delete this... ^^;