youngskkim / CRN

[ICCV'23] Official implementation of CRN: Camera Radar Net for Accurate, Robust, Efficient 3D Perception

Bounding boxes offset #16

Closed · MathisMM closed this 1 month ago

MathisMM commented 2 months ago

Hi,

I'm using CRN as a detection backbone on the nuScenes mini-dataset, and while visualizing the bounding boxes I've been seeing an offset for every detection (see image).

[image: n008-2018-08-01-15-16-36-0400__CAM_FRONT__1533151607112404]

I've made sure my visualization pipeline works by: 1) using the same code for another detection method, and 2) using the ground truth boxes, which are already in the camera frame: I first transformed them to the global frame and then ran the same visualization code to get the boxes back in the camera frame (see the sketch below).

For these 2 methods, the boxes are placed correctly.
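In case it helps, here is a rough sketch of check 2. It only assumes the standard nuScenes devkit API (`NuScenes`, `get_sample_data`, `Box.rotate`/`Box.translate`); the dataroot path is a placeholder:

```python
# Round-trip check: ground-truth boxes (camera frame) -> global frame -> back to camera frame.
import numpy as np
from nuscenes.nuscenes import NuScenes
from pyquaternion import Quaternion

nusc = NuScenes(version='v1.0-mini', dataroot='data/nuScenes', verbose=False)

sample = nusc.sample[0]
cam_token = sample['data']['CAM_FRONT']
sd_rec = nusc.get('sample_data', cam_token)
cs_rec = nusc.get('calibrated_sensor', sd_rec['calibrated_sensor_token'])
pose_rec = nusc.get('ego_pose', sd_rec['ego_pose_token'])

# get_sample_data returns the ground-truth boxes already in the CAM_FRONT frame.
_, gt_boxes, _ = nusc.get_sample_data(cam_token)

for box in gt_boxes:
    # camera -> ego -> global
    box.rotate(Quaternion(cs_rec['rotation']))
    box.translate(np.array(cs_rec['translation']))
    box.rotate(Quaternion(pose_rec['rotation']))
    box.translate(np.array(pose_rec['translation']))

    # global -> ego -> camera, i.e. the same path my visualization code uses
    box.translate(-np.array(pose_rec['translation']))
    box.rotate(Quaternion(pose_rec['rotation']).inverse)
    box.translate(-np.array(cs_rec['translation']))
    box.rotate(Quaternion(cs_rec['rotation']).inverse)
    # box is now back in the camera frame and renders at the correct position
```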

Would you have any insight as to why CRN behaves this way? (I'm using the r50 backbone.)

Thank you for your time and help.

adeelajmal2468 commented 2 months ago

How did you run inference?

MathisMM commented 2 months ago

@adeelajmal2468 I'm not sure I understand your question, but if you're asking how I ran the CRN model and displayed the bounding boxes, here it is. For CRN I'm just following the README from this repo and then running the model on the data with `python exps/det/CRN_r50_256x704_128x128_4key.py --ckpt_path checkpoint/CRN_r50_256x704_128x128_4key.pth -e -b 1 --gpus 1`.

From there I get a .json file containing all the bounding boxes for every token in the dataset. To load the dataset I follow the nuScenes recommendations and start by loading a scene, from which I get the first sample token. I parse the json file for any token that corresponds to this first token, and if there is one I:

  1. Extract the bounding box from the json file
  2. Translate and rotate the box to the ego_pose coordinate system
  3. Translate and rotate the box to the sensor (CAM_FRONT) coordinate system
  4. Use nuScenes' code (render_cv2) to add the box to the image corresponding to this token and this sensor.

I repeat the process for the next frame until there are no more.

I get the rotation and translation matrices from the sample metadata. Again, those work fine with another detector and with the ground truth (if I first transform those bounding boxes in the global frame).
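Roughly, the loop looks like the sketch below. It assumes the .json follows the standard nuScenes detection results format (a "results" dict keyed by sample token, each entry with 'translation', 'size', 'rotation'); the `results_nusc.json` name and dataroot are placeholders, and filtering of boxes that fall outside the image is omitted:

```python
import json
import cv2
import numpy as np
from nuscenes.nuscenes import NuScenes
from nuscenes.utils.data_classes import Box
from pyquaternion import Quaternion

nusc = NuScenes(version='v1.0-mini', dataroot='data/nuScenes', verbose=False)
results = json.load(open('results_nusc.json'))['results']  # placeholder file name

scene = nusc.scene[0]
sample_token = scene['first_sample_token']
while sample_token:
    sample = nusc.get('sample', sample_token)
    sd_rec = nusc.get('sample_data', sample['data']['CAM_FRONT'])
    cs_rec = nusc.get('calibrated_sensor', sd_rec['calibrated_sensor_token'])
    pose_rec = nusc.get('ego_pose', sd_rec['ego_pose_token'])
    img = cv2.imread(nusc.get_sample_data_path(sd_rec['token']))

    for det in results.get(sample_token, []):
        # 1) build the box from the json entry (global frame)
        box = Box(det['translation'], det['size'], Quaternion(det['rotation']))
        # 2) global -> ego
        box.translate(-np.array(pose_rec['translation']))
        box.rotate(Quaternion(pose_rec['rotation']).inverse)
        # 3) ego -> camera (CAM_FRONT)
        box.translate(-np.array(cs_rec['translation']))
        box.rotate(Quaternion(cs_rec['rotation']).inverse)
        # 4) draw with the devkit renderer
        box.render_cv2(img, view=np.array(cs_rec['camera_intrinsic']), normalize=True)

    cv2.imwrite(f'{sample_token}.jpg', img)
    sample_token = sample['next']  # empty string on the last sample of the scene
```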

loserhou commented 1 month ago

Hi, I also have the same question. Have you solved it?

loserhou commented 1 month ago

Have you solved it?

adeelajmal2468 commented 1 month ago

Not yet

adeelajmal2468 commented 1 month ago

@loserhou I just ran the training code, evaluated it, and generated predictions. Only inference is left, which means loading the weights and config file and running it just like shown in the video in the GitHub directory.

loserhou commented 1 month ago

Can you reproduce the r50 results? I used 4 A800 GPUs with batch size 8 and got: mAP 0.4681, mATE 0.5202, mASE 0.2827, mAOE 0.5189, mAVE 0.2834, mAAE 0.1873, NDS 0.5548 (eval time: 111.6 s).

adeelajmal2468 commented 1 month ago

@loserhou I got these results using batch size 4 and 1 GPU: [image]

MathisMM commented 1 month ago

> Have you solved it?

Hi,

Obviously I have finished the visualization code, if that was your question, since I'm showing examples of its output. However, I haven't tried running inference on new data.

I get the same results on the (full) nuScenes validation set as those shown in this GitHub repo for r50 (which are not the same as those in the paper). However, I still have my visualization offset issue.

Please try visualizing the output bounding boxes @loserhou and @adeelajmal2468 and tell me if you have the same issue. If not, please open a new issue and keep this one for the visualization issue.

Thank you kindly.

adeelajmal2468 commented 1 month ago

@MathisMM Can you provide the script that you have used for visualization of bounding boxes?

youngskkim commented 1 month ago

Can you check the origin of the bounding box? At some point, the mmdet3d code was updated and the reference point of the bounding box changed from bottom center to 3D center. I think translating the bounding box upwards by h/2 will give you accurate results.
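Something along these lines (a sketch only; the `det` dict layout assumes the nuScenes detection results format, and whether the shift is needed depends on which mmdet3d version produced the boxes):

```python
# If the predicted 'translation' is the bottom center of the box (older mmdet3d
# convention), shift it up by h/2 in the global frame before building the
# nuScenes Box and transforming it to the camera frame.
import numpy as np
from nuscenes.utils.data_classes import Box
from pyquaternion import Quaternion

def box_from_prediction(det):
    center = np.array(det['translation'], dtype=float)
    h = det['size'][2]          # nuScenes size order is (w, l, h)
    center[2] += h / 2.0        # bottom center -> 3D (gravity) center
    return Box(center, det['size'], Quaternion(det['rotation']))
```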

MathisMM commented 1 month ago

Hi @youngskkim, thank you for taking the time to reply! That worked great! I would not have figured it out on my own, so thanks a lot.

Best regards.