SamsungLabs / tr3d

[ICIP2023] TR3D: Towards Real-Time Indoor 3D Object Detection

train is ok but inference maybe has some bug??? #28

Open jiachen0212 opened 8 months ago

jiachen0212 commented 8 months ago

Hello, I have a question. I converted my own data into the ScanNet format and trained a model, getting the results below. They look normal (the train and val data are the same, so these values should look good), and the per-category metrics also look normal.

[screenshot: training/validation metrics]

However, when using the trained model for inference, I found that the detected box categories were all 0 and the orientation of the boxes was wrong... The config I used is as follows:


test_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='DEPTH',
        shift_height=False,
        use_color=True,
        load_dim=6,
        use_dim=[0, 1, 2, 3, 4, 5]),
    dict(type='GlobalAlignment', rotation_axis=2),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(type='NormalizePointsColor', color_mean=None),
            dict(
                type='DefaultFormatBundle3D',
                class_names=class_names,
                with_label=False),
            dict(type='Collect3D', keys=['points'])
        ])
]

Do you know where the problem might occur?

filaPro commented 8 months ago

Can you share the full config .py file? Also, to be sure: train and validation are ok, but test is not ok?

jiachen0212 commented 8 months ago

Can you share the full config .py file? Also, to be sure: train and validation are ok, but test is not ok?

Wow, thank you for replying so quickly. My config is as follows:

voxel_size = .01
n_points = 100000

model = dict(
    type='MinkSingleStage3DDetector',
    voxel_size=voxel_size,
    backbone=dict(type='MinkResNet', in_channels=3, max_channels=128, depth=34, norm='batch'),
    neck=dict(
        type='TR3DNeck',
        in_channels=(64, 128, 128, 128),
        out_channels=128),
    head=dict(
        type='TR3DHead',
        in_channels=128,
        n_reg_outs=6,
        n_classes=18,   # change this to match your number of classes
        voxel_size=voxel_size,
        assigner=dict(
            type='TR3DAssigner',
            top_pts_threshold=6,
            label2level=[0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0]),    
        bbox_loss=dict(type='AxisAlignedIoULoss', mode='diou', reduction='none')),
    train_cfg=dict(),
    test_cfg=dict(nms_pre=1000, iou_thr=.5, score_thr=.01))

optimizer = dict(type='AdamW', lr=.001, weight_decay=.0001)
optimizer_config = dict(grad_clip=dict(max_norm=10, norm_type=2))
lr_config = dict(policy='step', warmup=None, step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=17)
custom_hooks = [dict(type='EmptyCacheHook', after_iter=True)]

checkpoint_config = dict(interval=1, max_keep_ckpts=1)
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
])
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = None
load_from = None
resume_from = None
workflow = [('train', 1)]

dataset_type = 'ScanNetDataset'
data_root = './data/scannet/'
class_names = ('cabinet', 'bed', 'chair', 'sofa', 'table', 'door', 'window',
               'bookshelf', 'picture', 'counter', 'desk', 'curtain',
               'refrigerator', 'showercurtrain', 'toilet', 'sink', 'bathtub',
               'garbagebin')

train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='DEPTH',
        shift_height=False,
        use_color=True,
        load_dim=6,
        use_dim=[0, 1, 2, 3, 4, 5]),
    dict(type='LoadAnnotations3D'),
    dict(type='GlobalAlignment', rotation_axis=2),
    # we do not sample 100k points for scannet, as very few scenes have
    # significantly more than 100k points, so we sample 33% to 100% of them
    dict(type='PointSample', num_points=.33),
    dict(
        type='RandomFlip3D',
        sync_2d=False,
        flip_ratio_bev_horizontal=.5,
        flip_ratio_bev_vertical=.5),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-.02, .02],
        scale_ratio_range=[.9, 1.1],
        translation_std=[.1, .1, .1],
        shift_height=False),
    dict(type='NormalizePointsColor', color_mean=None),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='DEPTH',
        shift_height=False,
        use_color=True,
        load_dim=6,
        use_dim=[0, 1, 2, 3, 4, 5]),
    dict(type='GlobalAlignment', rotation_axis=2),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(type='NormalizePointsColor', color_mean=None),
            dict(
                type='DefaultFormatBundle3D',
                class_names=class_names,
                with_label=False),   
            dict(type='Collect3D', keys=['points'])
        ])
]
data = dict(
    samples_per_gpu=16,
    workers_per_gpu=4,
    train=dict(
        type='RepeatDataset',
        times=15,
        dataset=dict(
            type=dataset_type,
            data_root=data_root,
            ann_file=data_root + 'scannet_infos_train.pkl',
            pipeline=train_pipeline,
            filter_empty_gt=False,
            classes=class_names,
            box_type_3d='Depth')),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'scannet_infos_train.pkl',
        pipeline=test_pipeline,
        classes=class_names,
        test_mode=True,
        box_type_3d='Depth'),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'scannet_infos_train.pkl',   
        pipeline=test_pipeline,
        classes=class_names,
        test_mode=True,
        box_type_3d='Depth'))

Yes, "train and validation is ok, but test is not ok". Running tools/train.py and tools/test.py gives the right results, but python demo/pcd_demo.py xxx.bin configs/tr3d/tr3d_scannet-3d-18class.py work_dirs/tr3d_scannet-3d-18class/epoch_17.pth gives wrong results: the angles and categories of the boxes are wrong~~

filaPro commented 8 months ago

Ah, this pcd_demo.py thing doesn't work, I think. You need to debug it a little bit to be sure that everything is exactly like in the test.py script. One of the main things is dict(type='GlobalAlignment', rotation_axis=2); I think you are missing it in pcd_demo.py, so the walls are rotated away from the x and y axes.
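
For reference, here is a minimal sketch of what the GlobalAlignment step effectively does, assuming the 4x4 axis-align matrix from your ScanNet-style conversion is available (the .npy path below is hypothetical). Applying it to the raw cloud before running the demo should make the walls parallel to the axes:

import numpy as np

# Raw demo cloud: xyz + rgb, matching load_dim=6 in the config above.
points = np.fromfile('xxx.bin', dtype=np.float32).reshape(-1, 6)

# Hypothetical path: the axis-align matrix exported during data conversion.
axis_align_matrix = np.load('axis_align_matrix.npy')  # shape (4, 4)

# Apply the homogeneous transform to xyz only, as GlobalAlignment does.
xyz_h = np.hstack([points[:, :3], np.ones((len(points), 1), np.float32)])
points[:, :3] = (xyz_h @ axis_align_matrix.T)[:, :3]
points.astype(np.float32).tofile('xxx_aligned.bin')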

jiachen0212 commented 8 months ago

[screenshot: pcd_demo output]

Thank you very much for your reply~ I did some debugging following your tips and found that GlobalAlignment(rotation_axis=2) is in fact used in pcd_demo. But the orientation of the boxes is still not right. As for the box categories, I found the answer: they are stored under the 'labels_3d' key of the result. So I am still confused about how to get box detection visualizations with the correct orientation...~


Detection results of boxes whose directions are not aligned:

[screenshot: detections with misaligned box orientations]
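
For anyone debugging the same thing, a minimal sketch of inspecting the demo output directly, assuming the mmdet3d 1.0-style API that pcd_demo.py uses (paths as in the command above):

from mmdet3d.apis import inference_detector, init_model

model = init_model('configs/tr3d/tr3d_scannet-3d-18class.py',
                   'work_dirs/tr3d_scannet-3d-18class/epoch_17.pth')
result, data = inference_detector(model, 'xxx.bin')

# Per-box class indices live under 'labels_3d', confidences under 'scores_3d'.
print(result[0]['labels_3d'])
print(result[0]['scores_3d'])
print(result[0]['boxes_3d'])  # DepthInstance3DBoxes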

filaPro commented 8 months ago

Btw I'm a little confused about the rotation. As you use ScanNetDataset and n_reg_outs=6 in your config, we don't even predict rotation in this case. So all rotations are zero and both boxes and walls are parallel to the x and y axes.
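
In other words (an illustrative sketch, not repo code): with n_reg_outs=6 the head regresses only center and size, so yaw is implicitly zero; a rotated-box setup would need an extra regression target for the angle.

# n_reg_outs=6: axis-aligned box, no yaw channel.
pred_axis_aligned = [1.2, 0.4, 0.9, 0.8, 0.6, 1.0]    # x, y, z, dx, dy, dz

# A rotated-box head would additionally regress a yaw angle (hypothetical).
pred_rotated = [1.2, 0.4, 0.9, 0.8, 0.6, 1.0, 0.39]   # ... + yaw in radians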

jiachen0212 commented 8 months ago

Btw I'm a little confused about the rotation. As you use ScanNetDataset and n_reg_outs=6 in your config, we don't even predict rotation in this case. So all rotations are zero and both boxes and walls are parallel to the x and y axes.

Hmm, I probably understand. Thank you very much for your replies. Maybe it's a problem with my annotated data: I use my own annotated data and convert it into the ScanNet data format. I'll debug it again, thank you~
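
If it helps, a small sketch for sanity-checking the converted infos; the key names below follow mmdet3d's ScanNet converter, so adjust them if your conversion differs:

import pickle

with open('data/scannet/scannet_infos_train.pkl', 'rb') as f:
    infos = pickle.load(f)

annos = infos[0]['annos']
print(annos['gt_boxes_upright_depth'][:3])  # x, y, z, dx, dy, dz per box
print(annos['class'][:10])                  # integer class indices
print(annos['axis_align_matrix'])           # 4x4 alignment matrix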

jiachen0212 commented 8 months ago

I made some changes to mmdet3d/core/visualizer/open3d_vis.py, and the visualization looks better~

    # inside the box-drawing loop of mmdet3d/core/visualizer/open3d_vis.py
    in_box_color = np.array(points_in_box_color)
    for i in range(len(bbox3d)):
        center = bbox3d[i, 0:3]
        dim = bbox3d[i, 3:6]
        yaw = np.zeros(3)
        # yaw[rot_axis] = bbox3d[i, 6]  # coupling bug...
        yaw[rot_axis] = math.pi / 8  # manual modification
        rot_mat = geometry.get_rotation_matrix_from_xyz(yaw)
        print(rot_mat)

[screenshot: visualization after the change]
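
For context, a standalone sketch of the Open3D calls used above, with made-up center and size values, showing how a rotation about axis 2 (z) is baked into an oriented box:

import numpy as np
import open3d as o3d

center = np.array([1.0, 2.0, 0.5])  # made-up box center
dim = np.array([0.8, 0.6, 1.0])     # made-up box size

yaw = np.zeros(3)
yaw[2] = np.pi / 8  # rotate about the z axis (rot_axis == 2)
rot_mat = o3d.geometry.get_rotation_matrix_from_xyz(yaw)

box = o3d.geometry.OrientedBoundingBox(center, rot_mat, dim)
o3d.visualization.draw_geometries([box])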

MRCHENWJ commented 5 months ago

Hello, I created my own dataset following the format of the S3DIS dataset and tried to train the network with it. However, I encountered the CUDA out of memory error even though I switched to a GPU with 32GB of VRAM. How can I solve this issue?

2024-06-11 11:19:47,852 - mmdet - INFO - Checkpoints will be saved to /root/autodl-tmp/tr3d-main/work_dirs/tr3d_s3dis-3d-5class by HardDiskBackend.
Traceback (most recent call last):
  File "tools/train.py", line 263, in <module>
    main()
  File "tools/train.py", line 252, in main
    train_model(
  File "/root/autodl-tmp/tr3d-main/mmdet3d/apis/train.py", line 344, in train_model
    train_detector(
  File "/root/autodl-tmp/tr3d-main/mmdet3d/apis/train.py", line 319, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/root/miniconda3/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 53, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 31, in run_iter
    outputs = self.model.train_step(data_batch, self.optimizer,
  File "/root/miniconda3/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 77, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/root/miniconda3/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 248, in train_step
    losses = self(**data)
  File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 116, in new_func
    return old_func(*args, **kwargs)
  File "/root/autodl-tmp/tr3d-main/mmdet3d/models/detectors/base.py", line 60, in forward
    return self.forward_train(**kwargs)
  File "/root/autodl-tmp/tr3d-main/mmdet3d/models/detectors/mink_single_stage.py", line 86, in forward_train
    x = self.extract_feats(points)
  File "/root/autodl-tmp/tr3d-main/mmdet3d/models/detectors/mink_single_stage.py", line 70, in extract_feats
    x = self.neck(x)
  File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/autodl-tmp/tr3d-main/mmdet3d/models/necks/tr3d_neck.py", line 53, in forward
    x = inputs[i] + x
  File "/root/miniconda3/lib/python3.8/site-packages/MinkowskiEngine/MinkowskiTensor.py", line 556, in __add__
    return self._binary_functor(other, lambda x, y: x + y)
  File "/root/miniconda3/lib/python3.8/site-packages/MinkowskiEngine/MinkowskiTensor.py", line 531, in _binary_functor
    out_F = torch.zeros(
RuntimeError: CUDA out of memory. Tried to allocate 2.26 GiB (GPU 0; 15.74 GiB total capacity; 13.21 GiB already allocated; 471.56 MiB free; 13.38 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

filaPro commented 5 months ago

Hard to say what is wrong with your dataset, as you don't give many details. I recommend tuning voxel_size, n_points, and samples_per_gpu in the config file.
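
A minimal sketch of such overrides; the values are illustrative starting points to tune, not recommendations from the authors:

voxel_size = 0.02      # coarser voxels -> fewer active sparse locations
n_points = 50000       # sample fewer points per scene

data = dict(
    samples_per_gpu=4,  # smaller batch per GPU
    workers_per_gpu=4)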