SamsungLabs / imvoxelnet

[WACV2022] ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

How to train imvoxelnet on custom dataset #70

Closed: assia855 closed this issue 1 year ago

assia855 commented 1 year ago

Thank you for your hard work. I would like to train ImVoxelNet on my own dataset. My dataset consists only of 2D multi-view images of my object of interest, and I would like to detect it in 3D. Is that possible, or do I also need a point cloud for each frame, like in the KITTI dataset?

filaPro commented 1 year ago

Hi @assia855 , we don't use depth or point clouds in ImVoxelNet. You only need 2D images, their camera poses, and ground truth boxes in the world coordinate system.
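For reference, a minimal sketch of the per-frame data this implies, assuming a pinhole camera model. The field names here are illustrative only, not the repository's actual loader format:

```python
import numpy as np

# Hypothetical per-frame record: an RGB image, its 3x3 intrinsics, and its
# 4x4 camera-to-world pose in the shared world coordinate system.
frame = {
    "image_path": "scene0001/000000.jpg",
    "intrinsics": np.array([[600.0,   0.0, 320.0],   # fx,  0, cx
                            [  0.0, 600.0, 240.0],   #  0, fy, cy
                            [  0.0,   0.0,   1.0]]),
    "pose": np.eye(4),  # rotation + translation of the camera in world coords
}
```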

assia855 commented 1 year ago

Hi @filaPro, thanks for answering. Yes, I have 2D images and their poses, but regarding the ground truth boxes in the world coordinate system, do you mean labels or something else? Also, which configuration file and checkpoint should I use in my case? Thanks in advance.

filaPro commented 1 year ago

I mean that if you are going to train on your data, you need boxes given by 7 numbers (x_center, y_center, z_center, width, length, height, heading_angle), or 6 if there is no heading angle, plus a class label. For indoor scenes you can basically follow the imvoxelnet_sunrgbd_fast.py config.
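For illustration, one way such an annotation could be laid out per scene (the field names and layout are hypothetical, not what the repository's data converters produce):

```python
# Hypothetical per-scene annotation: each box is 7 numbers in the world
# coordinate system plus a class label, matching the convention above.
annotation = {
    "scene": "scene0001",
    "boxes": [
        # x_center, y_center, z_center, width, length, height, heading_angle
        [1.20, -0.35, 0.45, 0.55, 0.60, 0.90, 0.0],
    ],
    "labels": ["chair"],
}
```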

assia855 commented 1 year ago

Can you suggest an annotation tool that produces this type of annotation, please? I searched but I didn't find a free tool with a similar output. Thanks in advance.

filaPro commented 1 year ago

If you can reconstruct point clouds from your data (just for annotation), you can use any point cloud annotation tool, e.g. https://3d-on-3d.annotate.photo/ . Otherwise it's too hard to annotate 3D boxes, and you can look at something like https://github.com/google-research-datasets/Objectron .
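If your annotation tool exports each box as its 8 corner vertices rather than the 7-number form above, a rough sketch of the conversion could look like the following, assuming an upright box and a particular corner ordering (adapt the indexing to whatever your tool actually exports):

```python
import numpy as np

def corners_to_box(corners):
    """Convert 8 corners of an upright 3D box to (x, y, z, w, l, h, yaw).

    Assumes corners[0:4] walk the bottom face in order and corners[4:8] the
    top face, with corners[1]-corners[0] along the width and
    corners[3]-corners[0] along the length. This ordering is an assumption.
    """
    corners = np.asarray(corners, dtype=float)        # shape (8, 3)
    center = corners.mean(axis=0)                     # box center in world coords
    width = np.linalg.norm(corners[1] - corners[0])
    length = np.linalg.norm(corners[3] - corners[0])
    height = np.linalg.norm(corners[4] - corners[0])
    edge = corners[3] - corners[0]                    # length edge on the ground plane
    yaw = float(np.arctan2(edge[1], edge[0]))         # heading angle around z
    return np.array([*center, width, length, height, yaw])
```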