Pointcept / Pointcept

Pointcept: a codebase for point cloud perception research. Latest works: PTv3 (CVPR'24 Oral), PPT (CVPR'24), OA-CNNs (CVPR'24), MSC (CVPR'23)
MIT License

Add Custom Dataset with no 'normal' #196

Open Laventna opened 5 months ago

Laventna commented 5 months ago

Hello! Thanks for your great contribution!

I processed my own dataset the same way as S3DIS, but without normals. Following https://github.com/Pointcept/Pointcept/issues/163#issuecomment-1991969376, I removed every reference to `normal` in the config, but I still get an error: `in __call__ data_dict[key] = data_dict[key][idx_unique] KeyError: 'normal'`.
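For context, the traceback points at the re-indexing step inside the `GridSample` transform: after voxel sampling it re-indexes every key it was configured with, so any key that is listed but missing from the data dict fails. A minimal sketch of the failure mode (illustrative only, not the actual Pointcept source; all names below are stand-ins):

```python
import numpy as np

# A scene without normals, as in the custom dataset described above.
data_dict = {
    "coord": np.random.rand(100, 3),
    "color": np.random.rand(100, 3),
    "segment": np.random.randint(0, 8, 100),
}
keys = ("coord", "color", "normal", "segment")  # default-style key list still containing "normal"
idx_unique = np.arange(0, 100, 2)               # stand-in for the voxel-sampled point indices

for key in keys:
    data_dict[key] = data_dict[key][idx_unique]  # raises KeyError: 'normal', as in the traceback
```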

Config:

```python
weight = None
resume = False
evaluate = True
test_only = False
seed = 29384037
save_path = 'exp/s3dis/powerline'
num_worker = 10
batch_size = 4
batch_size_val = None
batch_size_test = None
epoch = 3000
eval_epoch = 100
sync_bn = False
enable_amp = False
empty_cache = False
find_unused_parameters = False
mix_prob = 0.8
param_dicts = [dict(keyword='block', lr=0.0006)]
hooks = [
    dict(type='CheckpointLoader'),
    dict(type='IterationTimer', warmup_iter=2),
    dict(type='InformationWriter'),
    dict(type='SemSegEvaluator'),
    dict(type='CheckpointSaver', save_freq=None),
    dict(type='PreciseEvaluator', test_last=False)
]
train = dict(type='DefaultTrainer')
test = dict(type='SemSegTester', verbose=True)
model = dict(
    type='DefaultSegmentorV2',
    num_classes=8,
    backbone_out_channels=64,
    backbone=dict(
        type='PT-v3m1',
        in_channels=6,
        order=['z', 'z-trans', 'hilbert', 'hilbert-trans'],
        stride=(2, 2, 2, 2),
        enc_depths=(2, 2, 2, 6, 2),
        enc_channels=(32, 64, 128, 256, 512),
        enc_num_head=(2, 4, 8, 16, 32),
        enc_patch_size=(128, 128, 128, 128, 128),
        dec_depths=(2, 2, 2, 2),
        dec_channels=(64, 64, 128, 256),
        dec_num_head=(4, 4, 8, 16),
        dec_patch_size=(128, 128, 128, 128),
        mlp_ratio=4,
        qkv_bias=True,
        qk_scale=None,
        attn_drop=0.0,
        proj_drop=0.0,
        drop_path=0.3,
        shuffle_orders=True,
        pre_norm=True,
        enable_rpe=True,
        enable_flash=False,
        upcast_attention=True,
        upcast_softmax=True,
        cls_mode=False,
        pdnorm_bn=False,
        pdnorm_ln=False,
        pdnorm_decouple=True,
        pdnorm_adaptive=False,
        pdnorm_affine=True,
        pdnorm_conditions=('ScanNet', 'S3DIS', 'Structured3D')),
    criteria=[
        dict(type='CrossEntropyLoss', loss_weight=1.0, ignore_index=-1),
        dict(type='LovaszLoss', mode='multiclass', loss_weight=1.0, ignore_index=-1)
    ])
optimizer = dict(type='AdamW', lr=0.006, weight_decay=0.05)
scheduler = dict(
    type='OneCycleLR',
    max_lr=[0.006, 0.0006],
    pct_start=0.05,
    anneal_strategy='cos',
    div_factor=10.0,
    final_div_factor=1000.0)
dataset_type = 'S3DISDataset'
data_root = 'data/powerline'
data = dict(
    num_classes=8,
    ignore_index=-1,
    names=[
        'line', 'town', 'Ground-line', 'drainage-line',
        'insulator', 'tree', 'land', 'clutter'
    ],
    train=dict(
        type='S3DISDataset',
        split=('Area_1', 'Area_2', 'Area_3', 'Area_4', 'Area_6'),
        data_root='data/powerline',
        transform=[
            dict(type='CenterShift', apply_z=True),
            dict(type='RandomDropout', dropout_ratio=0.2, dropout_application_ratio=0.2),
            dict(type='RandomRotate', angle=[-1, 1], axis='z', center=[0, 0, 0], p=0.5),
            dict(type='RandomRotate', angle=[-0.015625, 0.015625], axis='x', p=0.5),
            dict(type='RandomRotate', angle=[-0.015625, 0.015625], axis='y', p=0.5),
            dict(type='RandomScale', scale=[0.9, 1.1]),
            dict(type='RandomFlip', p=0.5),
            dict(type='RandomJitter', sigma=0.005, clip=0.02),
            dict(type='ChromaticAutoContrast', p=0.2, blend_factor=None),
            dict(type='ChromaticTranslation', p=0.95, ratio=0.05),
            dict(type='ChromaticJitter', p=0.95, std=0.05),
            dict(type='GridSample', grid_size=0.02, hash_type='fnv', mode='train',
                 return_grid_coord=True),
            dict(type='SphereCrop', sample_rate=0.6, mode='random'),
            dict(type='SphereCrop', point_max=204800, mode='random'),
            dict(type='CenterShift', apply_z=False),
            dict(type='NormalizeColor'),
            dict(type='ToTensor'),
            dict(type='Collect', keys=('coord', 'grid_coord', 'segment'), feat_keys='color')
        ],
        test_mode=False,
        loop=30),
    val=dict(
        type='S3DISDataset',
        split='Area_5',
        data_root='data/powerline',
        transform=[
            dict(type='CenterShift', apply_z=True),
            dict(type='Copy', keys_dict=dict(coord='origin_coord', segment='origin_segment')),
            dict(type='GridSample', grid_size=0.02, hash_type='fnv', mode='train',
                 return_grid_coord=True),
            dict(type='CenterShift', apply_z=False),
            dict(type='NormalizeColor'),
            dict(type='ToTensor'),
            dict(type='Collect',
                 keys=('coord', 'grid_coord', 'origin_coord', 'segment', 'origin_segment'),
                 offset_keys_dict=dict(offset='coord', origin_offset='origin_coord'),
                 feat_keys='color')
        ],
        test_mode=False),
    test=dict(
        type='S3DISDataset',
        split='Area_5',
        data_root='data/powerline',
        transform=[
            dict(type='CenterShift', apply_z=True),
            dict(type='NormalizeColor')
        ],
        test_mode=True,
        test_cfg=dict(
            voxelize=dict(type='GridSample', grid_size=0.02, hash_type='fnv', mode='test',
                          keys=('coord', 'color'), return_grid_coord=True),
            crop=None,
            post_transform=[
                dict(type='CenterShift', apply_z=False),
                dict(type='ToTensor'),
                dict(type='Collect', keys=('coord', 'grid_coord', 'index'), feat_keys='color')
            ],
            aug_transform=[
                [{'type': 'RandomScale', 'scale': [0.9, 0.9]}],
                [{'type': 'RandomScale', 'scale': [0.95, 0.95]}],
                [{'type': 'RandomScale', 'scale': [1, 1]}],
                [{'type': 'RandomScale', 'scale': [1.05, 1.05]}],
                [{'type': 'RandomScale', 'scale': [1.1, 1.1]}],
                [{'type': 'RandomScale', 'scale': [0.9, 0.9]}, {'type': 'RandomFlip', 'p': 1}],
                [{'type': 'RandomScale', 'scale': [0.95, 0.95]}, {'type': 'RandomFlip', 'p': 1}],
                [{'type': 'RandomScale', 'scale': [1, 1]}, {'type': 'RandomFlip', 'p': 1}],
                [{'type': 'RandomScale', 'scale': [1.05, 1.05]}, {'type': 'RandomFlip', 'p': 1}],
                [{'type': 'RandomScale', 'scale': [1.1, 1.1]}, {'type': 'RandomFlip', 'p': 1}]
            ])))
```

Looking forward to your reply, and thanks in advance!

Gofinge commented 5 months ago

Hi, add a `keys` parameter to the `GridSample` augmentation: `keys=("coord", "color", "segment")`. (The default key list contains `normal`.)
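In config terms, that means giving the train-time `GridSample` entry an explicit key list that matches the attributes the data actually has. A sketch based on the config above (only `keys` is added):

```python
dict(
    type='GridSample',
    grid_size=0.02,
    hash_type='fnv',
    mode='train',
    keys=('coord', 'color', 'segment'),  # no 'normal', since the data has none
    return_grid_coord=True)
```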

Laventna commented 5 months ago

Thanks for your patient reply, the problem is solved.

DoubleV6317 commented 5 months ago

I am doing minimal preprocessing on the S3DIS dataset, without adding the aligned angle or normal vectors, and training on the preprocessed result runs into KeyError: 'normal'. I solved that through your discussion, but immediately got a new problem, KeyError: 'c'. I am clueless about this and would like your help. Looking forward to your reply, and thanks in advance!

Laventna commented 5 months ago

> I am doing minimal preprocessing on the S3DIS dataset, without adding the aligned angle or normal vectors, and training on the preprocessed result runs into KeyError: 'normal'. I solved that through your discussion, but immediately got a new problem, KeyError: 'c'. I am clueless about this and would like your help. Looking forward to your reply, and thanks in advance!

I just reset `feat_keys` to `("coord", "color")`; following https://github.com/Pointcept/Pointcept/issues/108#issuecomment-1892571447 or https://github.com/Pointcept/Pointcept/issues/188#issuecomment-2037334770 should help.
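For reference, the likely cause of `KeyError: 'c'` is that `feat_keys` was set to the plain string `'color'`, which gets iterated character by character (`'c'`, `'o'`, ...); passing a tuple avoids this. The `Collect` entry would then look roughly like this (a sketch matching the working config below):

```python
dict(
    type='Collect',
    keys=('coord', 'grid_coord', 'segment'),
    feat_keys=('coord', 'color'))  # a tuple, not the bare string 'color'
```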

Laventna commented 5 months ago

Hello @Gofinge, sorry to bother you again. When I start the test, the process gets stuck and no error is reported; I don't know what's wrong with it.


Config:

```python
weight = 'exp/s3dis/powerline_msl_3/model/model_last.pth'
resume = True
evaluate = True
test_only = False
seed = 28601693
save_path = 'exp/s3dis/powerline_msl_3'
num_worker = 4
batch_size = 2
batch_size_val = None
batch_size_test = None
epoch = 1000
eval_epoch = 100
sync_bn = False
enable_amp = False
empty_cache = False
find_unused_parameters = False
mix_prob = 0.8
param_dicts = [dict(keyword='block', lr=0.0006)]
hooks = [
    dict(type='CheckpointLoader'),
    dict(type='IterationTimer', warmup_iter=2),
    dict(type='InformationWriter'),
    dict(type='SemSegEvaluator'),
    dict(type='CheckpointSaver', save_freq=None),
    dict(type='PreciseEvaluator', test_last=False)
]
train = dict(type='DefaultTrainer')
test = dict(type='SemSegTester', verbose=True)
model = dict(
    type='DefaultSegmentorV2',
    num_classes=8,
    backbone_out_channels=64,
    backbone=dict(
        type='PT-v3m1',
        in_channels=6,
        order=['z', 'z-trans', 'hilbert', 'hilbert-trans'],
        stride=(2, 2, 2, 2),
        enc_depths=(2, 2, 2, 6, 2),
        enc_channels=(32, 64, 128, 256, 512),
        enc_num_head=(2, 4, 8, 16, 32),
        enc_patch_size=(128, 128, 128, 128, 128),
        dec_depths=(2, 2, 2, 2),
        dec_channels=(64, 64, 128, 256),
        dec_num_head=(4, 4, 8, 16),
        dec_patch_size=(128, 128, 128, 128),
        mlp_ratio=4,
        qkv_bias=True,
        qk_scale=None,
        attn_drop=0.0,
        proj_drop=0.0,
        drop_path=0.3,
        shuffle_orders=True,
        pre_norm=True,
        enable_rpe=True,
        enable_flash=False,
        upcast_attention=True,
        upcast_softmax=True,
        cls_mode=False,
        pdnorm_bn=False,
        pdnorm_ln=False,
        pdnorm_decouple=True,
        pdnorm_adaptive=False,
        pdnorm_affine=True,
        pdnorm_conditions=('ScanNet', 'S3DIS', 'Structured3D')),
    criteria=[
        dict(type='CrossEntropyLoss', loss_weight=1.0, ignore_index=-1),
        dict(type='LovaszLoss', mode='multiclass', loss_weight=1.0, ignore_index=-1)
    ])
optimizer = dict(type='AdamW', lr=0.006, weight_decay=0.05)
scheduler = dict(
    type='OneCycleLR',
    max_lr=[0.006, 0.0006],
    pct_start=0.05,
    anneal_strategy='cos',
    div_factor=10.0,
    final_div_factor=1000.0)
dataset_type = 'S3DISDataset'
data_root = 'data/powerline'
data = dict(
    num_classes=8,
    ignore_index=-1,
    names=[
        'land', 'tree', 'line', 'Ground-line',
        'drainage-line', 'insulator', 'tower', 'clutter'
    ],
    train=dict(
        type='S3DISDataset',
        split=('Area_1', 'Area_2', 'Area_3', 'Area_4', 'Area_6'),
        data_root='data/powerline',
        transform=[
            dict(type='CenterShift', apply_z=True),
            dict(type='RandomDropout', dropout_ratio=0.2, dropout_application_ratio=0.2),
            dict(type='RandomRotate', angle=[-1, 1], axis='z', center=[0, 0, 0], p=0.5),
            dict(type='RandomRotate', angle=[-0.015625, 0.015625], axis='x', p=0.5),
            dict(type='RandomRotate', angle=[-0.015625, 0.015625], axis='y', p=0.5),
            dict(type='RandomScale', scale=[0.9, 1.1]),
            dict(type='RandomFlip', p=0.5),
            dict(type='RandomJitter', sigma=0.005, clip=0.02),
            dict(type='ChromaticAutoContrast', p=0.2, blend_factor=None),
            dict(type='ChromaticTranslation', p=0.95, ratio=0.05),
            dict(type='ChromaticJitter', p=0.95, std=0.05),
            dict(type='GridSample', grid_size=0.2, hash_type='fnv', mode='train',
                 keys=('coord', 'color', 'segment'), return_grid_coord=True),
            dict(type='SphereCrop', sample_rate=0.6, mode='random'),
            dict(type='SphereCrop', point_max=102400, mode='random'),
            dict(type='CenterShift', apply_z=False),
            dict(type='NormalizeColor'),
            dict(type='ToTensor'),
            dict(type='Collect', keys=('coord', 'grid_coord', 'segment'),
                 feat_keys=('coord', 'color'))
        ],
        test_mode=False,
        loop=10),
    val=dict(
        type='S3DISDataset',
        split='Area_5',
        data_root='data/powerline',
        transform=[
            dict(type='CenterShift', apply_z=True),
            dict(type='Copy', keys_dict=dict(coord='origin_coord', segment='origin_segment')),
            dict(type='GridSample', grid_size=0.2, hash_type='fnv', mode='train',
                 keys=('coord', 'color', 'segment'), return_grid_coord=True),
            dict(type='CenterShift', apply_z=False),
            dict(type='NormalizeColor'),
            dict(type='ToTensor'),
            dict(type='Collect',
                 keys=('coord', 'grid_coord', 'origin_coord', 'segment', 'origin_segment'),
                 offset_keys_dict=dict(offset='coord', origin_offset='origin_coord'),
                 feat_keys=('coord', 'color'))
        ],
        test_mode=False),
    test=dict(
        type='S3DISDataset',
        split='Area_5',
        data_root='data/powerline',
        transform=[
            dict(type='CenterShift', apply_z=True),
            dict(type='NormalizeColor')
        ],
        test_mode=True,
        test_cfg=dict(
            voxelize=dict(type='GridSample', grid_size=0.2, hash_type='fnv', mode='test',
                          keys=('coord', 'color'), return_grid_coord=True),
            crop=None,
            post_transform=[
                dict(type='CenterShift', apply_z=False),
                dict(type='ToTensor'),
                dict(type='Collect', keys=('coord', 'grid_coord', 'index'),
                     feat_keys=('coord', 'color'))
            ],
            aug_transform=[
                [{'type': 'RandomScale', 'scale': [0.9, 0.9]}],
                [{'type': 'RandomScale', 'scale': [0.95, 0.95]}],
                [{'type': 'RandomScale', 'scale': [1, 1]}],
                [{'type': 'RandomScale', 'scale': [1.05, 1.05]}],
                [{'type': 'RandomScale', 'scale': [1.1, 1.1]}],
                [{'type': 'RandomScale', 'scale': [0.9, 0.9]}, {'type': 'RandomFlip', 'p': 1}],
                [{'type': 'RandomScale', 'scale': [0.95, 0.95]}, {'type': 'RandomFlip', 'p': 1}],
                [{'type': 'RandomScale', 'scale': [1, 1]}, {'type': 'RandomFlip', 'p': 1}],
                [{'type': 'RandomScale', 'scale': [1.05, 1.05]}, {'type': 'RandomFlip', 'p': 1}],
                [{'type': 'RandomScale', 'scale': [1.1, 1.1]}, {'type': 'RandomFlip', 'p': 1}]
            ])))
```

DoubleV6317 commented 5 months ago

> I am doing minimal preprocessing on the S3DIS dataset, without adding the aligned angle or normal vectors, and training on the preprocessed result runs into KeyError: 'normal'. I solved that through your discussion, but immediately got a new problem, KeyError: 'c'. I am clueless about this and would like your help. Looking forward to your reply, and thanks in advance!

> I just reset `feat_keys` to `("coord", "color")`; following #108 (comment) or #188 (comment) should help.

Thank you very much, it works!

Gofinge commented 5 months ago

> When I start the test, the process gets stuck and no error is reported

Sorry, I don't know the reason either. Maybe you can try running it with a single GPU?

Laventna commented 5 months ago

@Gofinge I have some questions about a custom outdoor dataset. I want to know how to train PTv3 when the dataset is in a format similar to S3DIS.

I rewrote the preprocessing script to generate *.pth files, aligned the data root, num_classes, ... in the config, and kept everything else the way S3DIS does it. I know this may be a bit weird, but what should I change to make it work for outdoor data?

And about the input: is one *.pth file treated as one training sample? If so, does that mean there is a limit on the maximum number of points? At first the largest sample had about 10M points and it didn't work. After splitting the data so that the largest sample has about 4M points, training runs, but testing goes wrong and the training mIoU is very poor.

I'm a beginner, so I'm sorry if my questions are a bit basic; I really hope you can help me. Thanks in advance!

Gofinge commented 4 months ago

If you mean that the point clouds you need to handle contain a huge number of points, I think subsampling them during preprocessing is a good choice; 10M points is too much for current 3D models. During training, you can also adopt a larger grid size to make sure the model runs efficiently, and then explore how to boost the performance.
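As a starting point, here is a hedged sketch of such a preprocessing-time subsample: a simple grid/voxel downsample with NumPy, saved as one .pth dict per scene. The key names, array shapes, and output path are assumptions following the S3DIS-style layout described above, not the exact Pointcept format:

```python
import numpy as np
import torch


def grid_subsample(coord, grid_size=0.05):
    """Keep roughly one point per cell of size `grid_size` (same unit as coord)."""
    voxel = np.floor(coord / grid_size).astype(np.int64)
    # one representative index per occupied voxel
    _, idx = np.unique(voxel, axis=0, return_index=True)
    return np.sort(idx)


# Stand-in scene; a real outdoor scene here might have ~10M points.
n = 1_000_000
coord = (np.random.rand(n, 3) * 100).astype(np.float32)
color = np.random.randint(0, 256, (n, 3)).astype(np.float32)
segment = np.random.randint(0, 8, n).astype(np.int64)

idx = grid_subsample(coord, grid_size=0.05)
sample = dict(coord=coord[idx], color=color[idx], segment=segment[idx])
torch.save(sample, "data/powerline/Area_1/scene_0.pth")  # hypothetical output path
```

Whether to subsample more aggressively at preprocessing time or only enlarge the grid size at train time is something to verify on the data; the point is simply to cap the per-sample point count before it reaches the loader.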