qiank10 / MVDNet

Robust Multimodal Vehicle Detection in Foggy Weather Using Complementary Lidar and Radar Signals, CVPR 2021.
Apache License 2.0

labels about ground-truth #2

Open Etah0409 opened 3 years ago

Etah0409 commented 3 years ago

Hi, thanks for sharing the bounding-box annotations for the ORR dataset. I have some questions about the annotations (such as the units and the meaning of each field). Could you please add some notes? Best wishes

qiank10 commented 3 years ago

The format of the 2D labels is: Class, ID, Center Loc X (m), Center Loc Y (m), Width (m), Length (m), Yaw (degree)

The format of the 3D labels is the same as the KITTI labels.

The yaw angle is the clockwise angle from the heading direction of the ego vehicle in the bird's eye view.
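
For reference, a minimal sketch of parsing one 2D label line under this format (the field names in the returned dict are mine, not the dataset's):

def parse_2d_label(line):
    # Class, ID, Center Loc X (m), Center Loc Y (m), Width (m), Length (m), Yaw (deg)
    cls, obj_id, cx, cy, w, l, yaw = line.strip().split(' ')
    return {
        'class': cls,
        'id': int(obj_id),
        'center_x_m': float(cx),  # meters, ego-vehicle frame
        'center_y_m': float(cy),
        'width_m': float(w),
        'length_m': float(l),
        'yaw_deg': float(yaw),    # clockwise from the ego heading, bird's eye view
    }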

Etah0409 commented 3 years ago

Thanks for your kind reply! It really helps a lot.

yyxr75 commented 2 years ago

Hi there, I have been working on this demo recently, but I ran into a paradoxical problem when drawing the output results. I used matplotlib's plt.imshow() for the raw radar image and plt.scatter() for the lidar point cloud, and the two matched each other well, as below:

[image: radar image overlaid with the projected lidar points and bounding boxes]

In this figure, the blue boxes are the label boxes and the red dots are the centers of the label boxes. The red boxes are the model's predicted boxes and the blue dots are the centers of those boxes. The yellow points are projected from the point cloud, and the background image is the raw radar image.

As you can see, the label bbox center x and center y are located correctly in the figure above. But no matter how I add or subtract half of the box width or length, I can never move the center points onto a corner of the lidar point cloud in a way that works for the boxes on both sides of the road. To check whether it is correct, I am uploading my result-viewing code:

import matplotlib.pyplot as plt
import matplotlib.patches as patches
import numpy as np
import cv2
from PIL import Image
import os

inputDataPath = '/data/2019-01-10-11-46-21-radar-oxford-10k/processed/'

def readLidarData(filename):
    lidarData = np.fromfile(filename, dtype=np.float32)
    lidarData = np.reshape(lidarData, (-1, 4))
    [pcNum, pcDim] = np.shape(lidarData)
    x_arr = []
    y_arr = []
    z_arr = []
    intensity_arr = []
    for i in range(pcNum):
        # coordinate transfer: meters -> pixels (5 px/m over a 64 m x 64 m window)
        indx_x = 319 - 5 * (lidarData[i, 0] + 32)
        indx_y = 5 * (lidarData[i, 1] + 32)
        indx_z = 5 * lidarData[i, 2]
        intensity = lidarData[i, 3]
        if indx_x > 320 or indx_x < 0 or indx_y > 320 or indx_y < 0:
            continue
        # pack them together
        x_arr.append(indx_x)
        y_arr.append(indx_y)
        z_arr.append(indx_z)
        intensity_arr.append(intensity)
    return x_arr, y_arr, z_arr, intensity_arr

def readRawIndex():
    # map frame index -> timestamp, as listed in ImageSets/eval.txt
    dict_temp = {}
    fn_evalIndex = os.path.join(inputDataPath, 'ImageSets', 'eval.txt')
    with open(fn_evalIndex) as file:
        for line in file.readlines():
            line = line.strip()
            k = line.split(' ')[0]
            v = line.split(' ')[1]
            dict_temp[k] = v
    return dict_temp

def readResult():
    filename = './output/mvdnet/instances_predictions.csv'
    return np.loadtxt(filename)

def readLabel(labelFilename):
    this_labelFilename = os.path.join(inputDataPath, 'label_2d', labelFilename)
    class_arr = []
    id_arr = []
    leftup_x_arr = []
    leftup_y_arr = []
    width_arr = []
    height_arr = []
    angle_arr = []
    with open(this_labelFilename, "r") as f:
        for line in f.readlines():
            tmp = line.split(' ')
            class_arr.append(tmp[0])
            id_arr.append(int(tmp[1]))
            # center - width/2 + center_image
            # leftup_x = 5 * (float(tmp[2]) + float(tmp[4]) / 2) + 319 / 2
            # leftup_y = 5 * (float(tmp[3]) + float(tmp[5]) / 2) + 319 / 2
            leftup_x = 5 * float(tmp[2]) + 319 / 2
            leftup_y = 5 * float(tmp[3]) + 319 / 2
            leftup_x_arr.append(leftup_x)
            leftup_y_arr.append(leftup_y)
            width_arr.append(5 * float(tmp[4]))
            height_arr.append(5 * float(tmp[5]))
            # angle adjust
            angle = float(tmp[6])
            if angle >= -112.5 and angle < 67.5:
                direction = 0
            else:
                direction = 1
                if angle < -112.5:
                    angle = angle + 180
                else:
                    angle = angle - 180
            angle_arr.append(angle)
    return class_arr, id_arr, leftup_x_arr, leftup_y_arr, width_arr, height_arr, angle_arr

def drawOutput(resultMat, rawIndexDict):
    [objNum, resDim] = np.shape(resultMat)
    frameNum = 2799
    fig = plt.figure()
    for i in range(objNum):
        # i += 100
        frameNumNow = int(resultMat[i, 0])
        timestamp = rawIndexDict[str(frameNumNow)]
        fig.canvas.manager.set_window_title(str(timestamp))
        # read raw data images when the frame changes
        if frameNumNow != frameNum:
            print("frame num: {}".format(frameNum))
            # read lidar data
            lidarFilename = '{}.bin'.format(timestamp)
            lidar_rawDataFilename = os.path.join(inputDataPath, 'lidar', lidarFilename)
            [x_arr, y_arr, z_arr, intensity_arr] = readLidarData(lidar_rawDataFilename)
            # read radar data
            radarFilename = '{}.jpg'.format(timestamp)
            radar_rawDataFilename = os.path.join(inputDataPath, 'radar', radarFilename)
            # draw point cloud and radar image
            plt.scatter(y_arr, x_arr, s=0.05, c='yellow')
            plt.imshow(Image.open(radar_rawDataFilename))
            # load label bbox (read label_2d)
            labelFilename = '{}.txt'.format(timestamp)
            [class_arr, id, label_leftup_x, label_leftup_y, label_width, label_height,
             label_angle] = readLabel(labelFilename)
            for j in range(len(id)):
                plt.gca().add_patch(patches.Rectangle(
                    (label_leftup_x[j], label_leftup_y[j]), label_width[j], label_height[j],
                    # angle=180 * label_angle[j] / np.pi,
                    angle=180 - label_angle[j],
                    edgecolor='blue', facecolor='none', lw=2))
                plt.plot(label_leftup_x[j], label_leftup_y[j], 'ro')
        # load result bbox
        height = resultMat[i, 3]
        width = resultMat[i, 4]
        leftup_x = resultMat[i, 1]  # +width/2
        leftup_y = resultMat[i, 2]  # +height/2
        angle = resultMat[i, 5]
        # draw them together
        if frameNumNow == frameNum:
            plt.gca().add_patch(patches.Rectangle(
                (leftup_x, leftup_y), width, height,
                # angle=180 * angle / np.pi,
                angle=90 - angle,
                edgecolor='red', facecolor='none', lw=2))
            plt.plot(leftup_x, leftup_y, 'bo')
        else:
            # plt.hold(False)
            plt.pause(0.1)
            if frameNumNow == 2830:
                plt.pause(0)
            plt.clf()
        frameNum = frameNumNow

def main():
    resultMat = readResult()
    rawDataIndexDict = readRawIndex()
    drawOutput(resultMat, rawDataIndexDict)

if __name__ == "__main__":
    main()

I'm keen to get some help, thank you.

Sorry, I may have solved this problem myself: the fix was to do the rotation before the translation. My fault.
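
For anyone hitting the same thing: matplotlib's patches.Rectangle rotates the patch around its anchor corner (the xy argument), not around the box center, so the anchor has to be computed by rotating the center-to-corner offset first and translating to the center afterwards. A minimal sketch of that order of operations (the helper name is my own):

import numpy as np
import matplotlib.patches as patches

def rotated_rect(cx, cy, w, h, angle_deg, **kwargs):
    # rotate the center-to-corner offset first, then translate to the center
    a = np.deg2rad(angle_deg)
    dx = -w / 2 * np.cos(a) + h / 2 * np.sin(a)
    dy = -w / 2 * np.sin(a) - h / 2 * np.cos(a)
    return patches.Rectangle((cx + dx, cy + dy), w, h, angle=angle_deg, **kwargs)

# usage: plt.gca().add_patch(rotated_rect(cx, cy, w, h, angle, edgecolor='blue', facecolor='none'))

Newer matplotlib versions (3.6+) also accept rotation_point='center' on Rectangle, which avoids the manual offset entirely.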

304886938 commented 2 years ago

I want to visualize the labels too. Can you provide the code? Thank you for your help!

yyxr75 commented 2 years ago

Sorry to tell you that the above is all my code. Thanks.

MaiRajborirug commented 1 year ago

@yyxr75 Could you tell me how to get instances_predictions.csv? When I run eval.py, I get instances_predictions.pth. Is there code to convert the .pth file to .csv?

Thank you

neverstoplearn commented 1 year ago

Have you solved it? Thanks.

@MaiRajborirug @yyxr75

MaiRajborirug commented 1 year ago

@neverstoplearn @yyxr75 I can't find instances_predictions.csv either, so I wrote my own readResult function to load the results from instances_predictions.pth. Let me know if you find any bugs or any offset/scale adjustments missing from the function. My visualization for frame '2799' (1547121487422169.bin) looks weird: the ground truth (blue) and the prediction (red) boxes should overlap more. They seem to share the same top-left points (blue and red dots), but the code might be drawing the bounding boxes' width in the wrong direction.

import numpy as np
import torch

def readResult(filename='../output/mvdnet/instances_predictions.pth'):
    '''Flatten the saved predictions into an array with one row per box:
    [frame, leftup_x, leftup_y, height, width, angle, score]
    (red color for predictions in the plots above).'''
    checkpoint = torch.load(filename)
    len_eval = len(checkpoint)  # number of evaluated frames
    resultMat = []
    for i in range(len_eval):
        cars = checkpoint[i]['instances']
        frameNumNow = int(checkpoint[i]['image_id'])
        for j in range(len(cars)):
            leftup_x = cars[j]['bbox'][0]  # +width/2
            leftup_y = cars[j]['bbox'][1]  # +height/2
            height = cars[j]['bbox'][2]
            width = cars[j]['bbox'][3]
            angle = cars[j]['bbox'][4]
            score = cars[j]['score']
            resultArr = [frameNumNow, leftup_x, leftup_y, height, width, angle, score]
            resultMat.append(resultArr)
    resultMat = np.array(resultMat)
    return resultMat

[image: frame 2799 visualization with ground truth in blue and predictions in red]
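
One possibility I am considering: MVDNet's detector is built on Detectron2, and if the predicted bbox follows Detectron2's rotated-box convention it is (center_x, center_y, width, height, angle) with the angle in degrees counter-clockwise. In that case the first two values are the box center rather than a top-left corner, and bbox[2] is the width rather than the height, which would explain boxes that share a point but extend in the wrong direction. A sketch of drawing such a box from its corner points, assuming that convention holds here:

import numpy as np
import matplotlib.patches as patches

def box_corners(cx, cy, w, h, angle_deg):
    # four corners of a center-based rotated box: rotate the offsets, then translate
    a = np.deg2rad(angle_deg)
    R = np.array([[np.cos(a), -np.sin(a)],
                  [np.sin(a),  np.cos(a)]])
    offsets = np.array([[-w, -h], [w, -h], [w, h], [-w, h]]) / 2.0
    return offsets @ R.T + np.array([cx, cy])

# usage: plt.gca().add_patch(patches.Polygon(box_corners(cx, cy, w, h, angle),
#                                            closed=True, edgecolor='red', facecolor='none'))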

neverstoplearn commented 1 year ago

Thanks.

Afreshbird commented 1 year ago

Can you correctly associate the 3D bboxes with their corresponding objects? I ran every possible coordinate transformation, then used Open3D to visualize the scene and the bboxes. I tried my best, but I can't line them up. 😭😭 I've been stuck on this question for a long time. Can you help me?
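
For reference, this is the kind of construction I tried, assuming the labels follow the KITTI convention mentioned above (dimensions h, w, l; location x, y, z; rotation ry about the vertical axis). The axis order and coordinate frame here are my assumptions and may need swapping for this dataset:

import numpy as np
import open3d as o3d

def label_to_obb(x, y, z, h, w, l, yaw):
    # oriented box: extent along the box's local axes, rotated by yaw
    center = np.array([x, y, z])
    extent = np.array([l, w, h])
    R = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
                  [np.sin(yaw),  np.cos(yaw), 0.0],
                  [0.0,          0.0,         1.0]])
    return o3d.geometry.OrientedBoundingBox(center, R, extent)

# draw the point cloud together with all label boxes:
# o3d.visualization.draw_geometries([pcd] + [label_to_obb(*lab) for lab in labels])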