SamsungLabs / imvoxelnet

[WACV2022] ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection
MIT License

How to train with sunrgbd data set #64

Closed ToTensor closed 1 year ago

ToTensor commented 1 year ago

What steps are needed to train on the indoor SUN RGB-D dataset?

filaPro commented 1 year ago

Just follow our README for installation, data preprocessing, and the training command.

ToTensor commented 1 year ago

When I process the SUN RGB-D dataset by running `python tools/create_data.py sunrgbd --root-path ./data/sunrgbd --out-dir ./data/sunrgbd --extra-tag sunrgbd`, it reports an error:

Traceback (most recent call last):
  File "tools/create_data.py", line 4, in <module>
    from data_converter import indoor_converter as indoor
  File "/home/ly/imvoxelnet-master/tools/data_converter/indoor_converter.py", line 5, in <module>
    from tools.data_converter.scannet_data_utils import ScanNetData
ModuleNotFoundError: No module named 'tools.data_converter'

I tried many ways but couldn't solve it. Could you please help me?

filaPro commented 1 year ago

But data_converter is actually present in tools. Maybe it's something about your installation. Can you please try `PYTHONPATH=./ python tools/create_data.py ...`?
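
For context, here is a minimal sketch of why the `PYTHONPATH=./` prefix can help. When a script is run as `python tools/create_data.py`, Python puts the script's own directory (`tools/`) first on `sys.path`, not the current working directory, so `data_converter` resolves but `tools.data_converter` does not. The temporary directory layout and the `resolves` and `demo` helpers below are illustrative assumptions, not part of the repository.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

_NAMES = ("tools", "data_converter")


def resolves(module_name, search_paths):
    """Return True if module_name can be found with sys.path set to search_paths."""
    saved_path = list(sys.path)
    # Temporarily hide any already-imported modules with the same top-level names.
    hidden = {k: sys.modules.pop(k) for k in list(sys.modules)
              if k.split(".")[0] in _NAMES}
    sys.path[:] = search_paths
    importlib.invalidate_caches()
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when the parent package ("tools") itself cannot be imported.
        return False
    finally:
        sys.path[:] = saved_path
        for k in list(sys.modules):
            if k.split(".")[0] in _NAMES:
                del sys.modules[k]
        sys.modules.update(hidden)


def demo():
    """Build a throwaway tools/data_converter layout and test both search paths."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        pkg = repo / "tools" / "data_converter"
        pkg.mkdir(parents=True)
        (pkg / "__init__.py").write_text("")
        script_dir = [str(repo / "tools")]  # sys.path[0] for `python tools/create_data.py`
        repo_root = [str(repo)]             # what PYTHONPATH=./ adds when run from the repo root
        return {
            "data_converter_from_script_dir": resolves("data_converter", script_dir),
            "tools_pkg_from_script_dir": resolves("tools.data_converter", script_dir),
            "tools_pkg_from_repo_root": resolves("tools.data_converter", repo_root),
        }


if __name__ == "__main__":
    for name, found in demo().items():
        print(name, found)
```

In this sketch only the repository root on the search path makes `tools.data_converter` importable, which matches the traceback: the first import (`from data_converter import ...`) succeeded and the second (`from tools.data_converter...`) failed.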

ToTensor commented 1 year ago

Excuse me, where should I add `PYTHONPATH=./ python tools/create_data.py ...`? I'm falling apart.

filaPro commented 1 year ago

You said you got an error when running `python tools/create_data.py ...`, so you can try running `PYTHONPATH=./ python tools/create_data.py ...` instead.

Btw you can try running ImVoxelNet on SUN RGB-D in the original mmdetection3d repo here.

ToTensor commented 1 year ago

Thanks, I'll try.

ToTensor commented 1 year ago

Thank you for your reply. The problem with the dataset has been solved. Can this algorithm be deployed on the NVIDIA Jetson Xavier NX edge computing device?

ToTensor commented 1 year ago

`2023-03-13 16:23:19,100 - mmdet3d - INFO - Environment info:

sys.platform: linux
Python: 3.7.16 (default, Jan 17 2023, 22:20:44) [GCC 11.2.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 3090
CUDA_HOME: /home/ly/cuda-11.5
NVCC: Cuda compilation tools, release 11.5, V11.5.50
GCC: gcc (Ubuntu 7.5.0-6ubuntu2) 7.5.0
PyTorch: 1.7.1+cu110
PyTorch compiling details: PyTorch built with:

TorchVision: 0.8.2+cu110
OpenCV: 4.7.0
MMCV: 1.7.0
MMCV Compiler: GCC 7.5
MMCV CUDA Compiler: not available
MMDetection: 2.27.0
MMSegmentation: 0.30.0
MMDetection3D: 1.0.0rc6+
spconv2.0: False

2023-03-13 16:23:19,100 - mmdet3d - INFO - Distributed training: False
2023-03-13 16:23:19,450 - mmdet3d - INFO - Config:
model = dict(
    type='ImVoxelNet',
    pretrained='torchvision://resnet50',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=False),
        norm_eval=True,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=64,
        num_outs=4),
    neck_3d=dict(
        type='ImVoxelNeck',
        channels=[64, 128, 256, 512],
        out_channels=64,
        down_layers=[1, 2, 3, 4],
        up_layers=[3, 2, 1],
        conditional=False),
    bbox_head=dict(
        type='SunRgbdImVoxelHead',
        n_classes=10,
        n_channels=64,
        n_convs=0,
        n_reg_outs=7),
    n_voxels=(80, 80, 32),
    voxel_size=(0.08, 0.08, 0.08))
train_cfg = dict()
test_cfg = dict(
    nms_pre=1000, nms_thr=0.15, use_rotate_nms=True, score_thr=0.05)
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
dataset_type = 'SunRgbdMultiViewDataset'
data_root = 'data/sunrgbd/'
class_names = ('bed', 'table', 'sofa', 'chair', 'toilet', 'desk', 'dresser',
               'night_stand', 'bookshelf', 'bathtub')
train_pipeline = [
    dict(type='LoadAnnotations3D'),
    dict(
        type='MultiViewPipeline',
        n_images=1,
        transforms=[
            dict(type='LoadImageFromFile'),
            dict(type='RandomFlip', flip_ratio=0.5),
            dict(
                type='Resize',
                img_scale=[(512, 384), (768, 576)],
                multiscale_mode='range',
                keep_ratio=True),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32)
        ]),
    dict(type='SunRgbdRandomFlip'),
    dict(
        type='DefaultFormatBundle3D',
        class_names=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk',
                     'dresser', 'night_stand', 'bookshelf', 'bathtub')),
    dict(type='Collect3D', keys=['img', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type='MultiViewPipeline',
        n_images=1,
        transforms=[
            dict(type='LoadImageFromFile'),
            dict(type='Resize', img_scale=(640, 480), keep_ratio=True),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32)
        ]),
    dict(
        type='DefaultFormatBundle3D',
        class_names=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk',
                     'dresser', 'night_stand', 'bookshelf', 'bathtub'),
        with_label=False),
    dict(type='Collect3D', keys=['img'])
]
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=4,
    train=dict(
        type='RepeatDataset',
        times=2,
        dataset=dict(
            type='SunRgbdMultiViewDataset',
            data_root='data/sunrgbd/',
            ann_file='data/sunrgbd/sunrgbd_imvoxelnet_infos_train.pkl',
            pipeline=[
                dict(type='LoadAnnotations3D'),
                dict(
                    type='MultiViewPipeline',
                    n_images=1,
                    transforms=[
                        dict(type='LoadImageFromFile'),
                        dict(type='RandomFlip', flip_ratio=0.5),
                        dict(
                            type='Resize',
                            img_scale=[(512, 384), (768, 576)],
                            multiscale_mode='range',
                            keep_ratio=True),
                        dict(
                            type='Normalize',
                            mean=[123.675, 116.28, 103.53],
                            std=[58.395, 57.12, 57.375],
                            to_rgb=True),
                        dict(type='Pad', size_divisor=32)
                    ]),
                dict(type='SunRgbdRandomFlip'),
                dict(
                    type='DefaultFormatBundle3D',
                    class_names=('bed', 'table', 'sofa', 'chair', 'toilet',
                                 'desk', 'dresser', 'night_stand',
                                 'bookshelf', 'bathtub')),
                dict(
                    type='Collect3D',
                    keys=['img', 'gt_bboxes_3d', 'gt_labels_3d'])
            ],
            classes=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk',
                     'dresser', 'night_stand', 'bookshelf', 'bathtub'),
            filter_empty_gt=True,
            box_type_3d='Depth')),
    val=dict(
        type='SunRgbdMultiViewDataset',
        data_root='data/sunrgbd/',
        ann_file='data/sunrgbd/sunrgbd_imvoxelnet_infos_val.pkl',
        pipeline=[
            dict(
                type='MultiViewPipeline',
                n_images=1,
                transforms=[
                    dict(type='LoadImageFromFile'),
                    dict(type='Resize', img_scale=(640, 480), keep_ratio=True),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32)
                ]),
            dict(
                type='DefaultFormatBundle3D',
                class_names=('bed', 'table', 'sofa', 'chair', 'toilet',
                             'desk', 'dresser', 'night_stand', 'bookshelf',
                             'bathtub'),
                with_label=False),
            dict(type='Collect3D', keys=['img'])
        ],
        classes=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk', 'dresser',
                 'night_stand', 'bookshelf', 'bathtub'),
        test_mode=True,
        box_type_3d='Depth'),
    test=dict(
        type='SunRgbdMultiViewDataset',
        data_root='data/sunrgbd/',
        ann_file='data/sunrgbd/sunrgbd_imvoxelnet_infos_val.pkl',
        pipeline=[
            dict(
                type='MultiViewPipeline',
                n_images=1,
                transforms=[
                    dict(type='LoadImageFromFile'),
                    dict(type='Resize', img_scale=(640, 480), keep_ratio=True),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32)
                ]),
            dict(
                type='DefaultFormatBundle3D',
                class_names=('bed', 'table', 'sofa', 'chair', 'toilet',
                             'desk', 'dresser', 'night_stand', 'bookshelf',
                             'bathtub'),
                with_label=False),
            dict(type='Collect3D', keys=['img'])
        ],
        classes=('bed', 'table', 'sofa', 'chair', 'toilet', 'desk', 'dresser',
                 'night_stand', 'bookshelf', 'bathtub'),
        test_mode=True,
        box_type_3d='Depth'))
optimizer = dict(
    type='AdamW',
    lr=0.0001,
    weight_decay=0.0001,
    paramwise_cfg=dict(
        custom_keys=dict(backbone=dict(lr_mult=0.1, decay_mult=1.0))))
optimizer_config = dict(grad_clip=dict(max_norm=35.0, norm_type=2))
lr_config = dict(policy='step', step=[8, 11])
total_epochs = 12
checkpoint_config = dict(interval=1, max_keep_ckpts=1)
log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook'),
           dict(type='TensorboardLoggerHook')])
evaluation = dict(interval=1)
dist_params = dict(backend='nccl')
find_unused_parameters = True
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
work_dir = './work_dirs/imvoxelnet_sunrgbd'
gpu_ids = range(0, 1)

2023-03-13 16:23:19,450 - mmdet3d - INFO - Set random seed to 0, deterministic: False
/home/ly/Desktop/mmdetection3d/mmdet3d/models/builder.py:86: UserWarning: train_cfg and test_cfg is deprecated, please specify them in model
  'please specify them in model', UserWarning)
Traceback (most recent call last):
  File "/home/ly/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/utils/registry.py", line 69, in build_from_cfg
    return obj_cls(**args)
TypeError: __init__() got an unexpected keyword argument 'voxel_size'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tools/train.py", line 166, in <module>
    main()
  File "tools/train.py", line 139, in main
    cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg)
  File "/home/ly/Desktop/mmdetection3d/mmdet3d/models/builder.py", line 93, in build_detector
    cfg, default_args=dict(train_cfg=train_cfg, test_cfg=test_cfg))
  File "/home/ly/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/utils/registry.py", line 237, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "/home/ly/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/cnn/builder.py", line 27, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/home/ly/anaconda3/envs/openmmlab/lib/python3.7/site-packages/mmcv/utils/registry.py", line 72, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: ImVoxelNet: __init__() got an unexpected keyword argument 'voxel_size'`

Is there a problem with my environment configuration? I tried many ways to solve it. Can you help?

filaPro commented 1 year ago

It looks like you are running a config from samsunglabs/imvoxelnet, which contains the voxel_size argument, in the openmmlab/mmdetection3d codebase, where ImVoxelNet does not accept this parameter.
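
To illustrate the mechanism, here is a simplified sketch of a registry-style builder that calls `obj_cls(**args)`, as the traceback shows mmcv doing. `ToyDetector`, its constructor arguments, and the `unexpected_keys` helper are hypothetical stand-ins, not the real mmdetection3d classes or an mmcv API; the point is only that any config key the constructor does not declare ends in this kind of `TypeError`.

```python
import inspect


class ToyDetector:
    """Hypothetical detector whose constructor has no voxel_size parameter."""

    def __init__(self, backbone, n_voxels):
        self.backbone = backbone
        self.n_voxels = n_voxels


def build_from_cfg(cfg, registry):
    """Simplified builder: pop 'type', pass every remaining key as a keyword argument."""
    args = dict(cfg)
    obj_cls = registry[args.pop("type")]
    try:
        return obj_cls(**args)
    except TypeError as e:
        # Prefix the class name, as in the traceback above.
        raise TypeError(f"{obj_cls.__name__}: {e}") from e


def unexpected_keys(cfg, obj_cls):
    """List config keys that obj_cls.__init__ does not accept."""
    params = inspect.signature(obj_cls.__init__).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []  # a **kwargs constructor accepts any key
    return [k for k in cfg if k != "type" and k not in params]


if __name__ == "__main__":
    registry = {"ToyDetector": ToyDetector}
    cfg = dict(type="ToyDetector", backbone="resnet50",
               n_voxels=(80, 80, 32), voxel_size=(0.08, 0.08, 0.08))
    print(unexpected_keys(cfg, ToyDetector))
    try:
        build_from_cfg(cfg, registry)
    except TypeError as e:
        print(e)
```

A check like `unexpected_keys` can be a quick way to confirm that a config and a codebase come from different implementations before starting a training run.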

I'm not sure about running the model on NVIDIA Jetson. In principle it should be possible, as all trainable layers come directly from PyTorch, e.g. Conv2d or Conv3d. But you will somehow need to figure out the code for the 2D-3D reprojection and the NMS function in preprocessing.
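
As a rough illustration of what the 2D-3D projection step involves, here is a minimal pinhole-camera sketch in plain Python. It is not the repository's implementation: the intrinsics (focal length 500, principal point (320, 240)) and the `project_points` helper are assumptions chosen only for the example, and the points are taken to be already in the camera frame.

```python
def project_points(points_3d, projection):
    """Project 3D points with a 3x4 matrix; returns a list of (u, v, valid) tuples.

    A point is marked invalid when it lies on or behind the camera plane (w <= 0).
    """
    pixels = []
    for x, y, z in points_3d:
        homogeneous = (x, y, z, 1.0)
        u, v, w = (sum(row[i] * homogeneous[i] for i in range(4))
                   for row in projection)
        if w <= 0:
            pixels.append((None, None, False))
        else:
            pixels.append((u / w, v / w, True))
    return pixels


def inside_image(pixel, width, height):
    """Check whether a projected pixel is valid and falls inside the image bounds."""
    u, v, valid = pixel
    return valid and 0 <= u < width and 0 <= v < height


# Hypothetical intrinsics for a 640x480 image, with identity extrinsics.
PROJECTION = [
    [500.0, 0.0, 320.0, 0.0],
    [0.0, 500.0, 240.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

if __name__ == "__main__":
    points = [(0.0, 0.0, 2.0), (0.5, 0.25, 2.0), (0.0, 0.0, -1.0)]
    for point, pixel in zip(points, project_points(points, PROJECTION)):
        print(point, pixel, inside_image(pixel, 640, 480))
```

On an edge device, the same arithmetic would need to run for every voxel center, so whether it is exported with the network or executed separately is one of the things to work out.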

ToTensor commented 1 year ago

Is there a problem with how I compiled mmdet3d? It compiled successfully, though. How can I solve this?

filaPro commented 1 year ago

Basically, I think you first need to pick one of these two implementations. If you use mmdetection3d, you don't need to install imvoxelnet, and vice versa. So, are you now using the master branch of mmdetection3d or imvoxelnet?