abhi1kumar / DEVIANT

[ECCV 2022] Official PyTorch Code of DEVIANT: Depth Equivariant Network for Monocular 3D Object Detection
https://arxiv.org/abs/2207.10758
MIT License

Run on raw live video #1

zillur-av closed this issue 1 year ago

zillur-av commented 2 years ago

I downloaded the pre-trained weights. I would like to use the KITTI weights to run on my raw video/webcam and get the output with the 3D box and bird's eye view. How can I do that? Also, how do I include my extrinsic and intrinsic camera calibration parameters?

abhi1kumar commented 2 years ago

Hi @Zillurcuet

> I downloaded the pre-trained weights. I would like to use the KITTI weights to run on my raw video/webcam and get the output with the 3D box and bird's eye view. How can I do that?

I have not tried this, but here are the steps to do it for other images:

├── data
│   ├── KITTI
│   │   ├── ImageSets
│   │   ├── kitti_split1
│   │   └── testing
│   │       ├── calib
│   │       └── image_2

You need to chain the above four steps in a loop for a raw video. Please feel free to contribute the demo code to our repo by opening a pull request.
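The per-frame staging this describes can be sketched as a loop that drops each captured frame and its calibration file into the KITTI `testing` layout shown above, then calls the repo's inference on it. This is only a sketch: `stage_frame` is an illustrative helper, not part of DEVIANT, and the frame bytes here stand in for real encoded images from `cv2.VideoCapture`.

```python
import tempfile
from pathlib import Path

# Example KITTI P2 line (product of intrinsics and extrinsics, 12 row-major values).
P2_LINE = ("P2: 7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 "
           "4.575831000000e+01 0.000000000000e+00 7.070493000000e+02 "
           "1.805066000000e+02 -3.454157000000e-01 0.000000000000e+00 "
           "0.000000000000e+00 1.000000000000e+00 4.981016000000e-03")

def stage_frame(root: Path, idx: int, frame_bytes: bytes, p2_line: str) -> Path:
    """Write one video frame plus its calib file in the KITTI testing layout.

    Illustrative helper (not part of the DEVIANT repo): `frame_bytes` would be
    an encoded image from your capture source; the inference scripts then read
    data/KITTI/testing exactly as they would for the KITTI test split.
    """
    img_dir = root / "KITTI" / "testing" / "image_2"
    calib_dir = root / "KITTI" / "testing" / "calib"
    img_dir.mkdir(parents=True, exist_ok=True)
    calib_dir.mkdir(parents=True, exist_ok=True)
    stem = f"{idx:06d}"                                  # KITTI-style zero-padded id
    (img_dir / f"{stem}.png").write_bytes(frame_bytes)
    (calib_dir / f"{stem}.txt").write_text(p2_line + "\n")
    return img_dir / f"{stem}.png"

# Loop over frames (faked here; with a webcam you would read from cv2.VideoCapture):
root = Path(tempfile.mkdtemp()) / "data"
for idx, frame in enumerate([b"fake-frame-0", b"fake-frame-1"]):
    stage_frame(root, idx, frame, P2_LINE)
    # ... run DEVIANT inference + BEV/3D-box visualization on this frame here ...
```

Since every frame shares one fixed camera, the same calib line can be reused for the whole video.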

Note: Mono3D models (unlike 2D detectors) are sensitive to the training dataset and do not perform well when tested on another dataset. See Fig. 14 and Tab. 6 of our paper.

> Also, how do I include my extrinsic and intrinsic camera calibration parameters?

Those go inside the text files of the calib folder. P2 is the 3x4 camera projection matrix, the product of the camera intrinsic matrix and the extrinsic matrix.

P2: 7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 4.575831000000e+01 0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 -3.454157000000e-01 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 4.981016000000e-03
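For your own camera, a P2 line in this format can be composed from your intrinsics and extrinsics; a minimal sketch with illustrative values (the K, R, t numbers below are placeholders, not your actual calibration):

```python
import numpy as np

# Intrinsic matrix K (placeholder values: fx, fy in pixels, principal point cx, cy).
K = np.array([[707.0,   0.0, 604.0],
              [  0.0, 707.0, 180.5],
              [  0.0,   0.0,   1.0]])

# Extrinsics: rotation R and translation t of the camera w.r.t. the reference frame
# (identity rotation and a small x-offset here, purely for illustration).
R = np.eye(3)
t = np.array([[0.06], [0.0], [0.0]])

Rt = np.hstack([R, t])   # 3x4 extrinsic matrix [R | t]
P2 = K @ Rt              # 3x4 projection matrix, as used in KITTI calib files

# Serialize in the calib-file format: "P2: " followed by 12 row-major values.
line = "P2: " + " ".join(f"{v:.12e}" for v in P2.flatten())
```

Writing `line` into each `calib/XXXXXX.txt` file reproduces the format of the KITTI example above.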