Pointcept / Pointcept

Pointcept: a codebase for point cloud perception research. Latest works: PTv3 (CVPR'24 Oral), PPT (CVPR'24), OA-CNNs (CVPR'24), MSC (CVPR'23)
MIT License

CUDA: out of memory #241

Open unrestrainednatural opened 1 month ago

unrestrainednatural commented 1 month ago

Hi, I would like to ask whether a single 48 GB A40 GPU can train PTv3 on the ScanNet dataset (without flash attention). With the initial batch size of 12 the GPU ran out of memory; changing it to 6 still ran out of memory, and only after setting it to 1 could training start. The detailed error output is below.

(pointcept) root@1sq9f61k2p5u-0:~/zwj/Pointcept-main# sh scripts/train.sh -g 1 -d scannet -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base
Experiment name: semseg-pt-v3m1-0-base
Python interpreter dir: python
Dataset: scannet
Config: semseg-pt-v3m1-0-base
GPU Num: 1
=========> CREATE EXP DIR <=========
Experiment dir: /root/zwj/Pointcept-main/exp/scannet/semseg-pt-v3m1-0-base
Loading config in: configs/scannet/semseg-pt-v3m1-0-base.py
Running code in: exp/scannet/semseg-pt-v3m1-0-base/code
=========> RUN TASK <=========
[2024-05-06 12:58:18,338 INFO train.py line 128 173332] => Loading config ...
[2024-05-06 12:58:18,338 INFO train.py line 130 173332] Save path: exp/scannet/semseg-pt-v3m1-0-base
[2024-05-06 12:58:19,167 INFO train.py line 131 173332] Config: weight = None resume = False evaluate = True test_only = False seed = 17779296 save_path = 'exp/scannet/semseg-pt-v3m1-0-base' num_worker = 24 batch_size = 12 batch_size_val = None batch_size_test = None epoch = 800 eval_epoch = 100 sync_bn = False enable_amp = True empty_cache = False find_unused_parameters = False mix_prob = 0.8 param_dicts = [dict(keyword='block', lr=0.0006)] hooks = [ dict(type='CheckpointLoader'), dict(type='IterationTimer', warmup_iter=2), dict(type='InformationWriter'), dict(type='SemSegEvaluator'), dict(type='CheckpointSaver', save_freq=None), dict(type='PreciseEvaluator', test_last=False) ] train = dict(type='DefaultTrainer') test = dict(type='SemSegTester', verbose=True) model = dict( type='DefaultSegmentorV2', num_classes=20, backbone_out_channels=64, backbone=dict( type='PT-v3m1', in_channels=6, order=('z', 'z-trans', 'hilbert', 'hilbert-trans'), stride=(2, 2, 2, 2), enc_depths=(2, 2, 2, 6, 2), enc_channels=(32, 64, 128, 256, 512), enc_num_head=(2, 4, 8, 16, 32), enc_patch_size=(1024, 1024, 1024, 1024, 1024), dec_depths=(2, 2, 2, 2), dec_channels=(64, 64, 128, 256), dec_num_head=(4, 4, 8, 16), dec_patch_size=(1024, 1024, 1024, 1024), mlp_ratio=4, qkv_bias=True, qk_scale=None, attn_drop=0.0, proj_drop=0.0, drop_path=0.3, shuffle_orders=True, pre_norm=True, enable_rpe=False, enable_flash=False, upcast_attention=False, upcast_softmax=False, cls_mode=False, pdnorm_bn=False, pdnorm_ln=False, pdnorm_decouple=True, pdnorm_adaptive=False, pdnorm_affine=True, pdnorm_conditions=('ScanNet', 'S3DIS', 'Structured3D')), criteria=[ dict(type='CrossEntropyLoss', loss_weight=1.0, ignore_index=-1), dict( type='LovaszLoss', mode='multiclass', loss_weight=1.0, ignore_index=-1) ]) optimizer = dict(type='AdamW', lr=0.006, weight_decay=0.05) scheduler = dict( type='OneCycleLR', max_lr=[0.006, 0.0006], pct_start=0.05, anneal_strategy='cos', div_factor=10.0, final_div_factor=1000.0) dataset_type = 'ScanNetDataset' data_root = 'data/scannet' data = dict( num_classes=20, ignore_index=-1, names=[ 'wall', 'floor', 'cabinet', 'bed', 'chair', 'sofa', 'table', 'door', 'window', 'bookshelf', 'picture', 'counter', 'desk', 'curtain', 'refridgerator', 'shower curtain', 'toilet', 'sink', 'bathtub', 'otherfurniture' ], train=dict( type='ScanNetDataset', split='train', data_root='data/scannet', transform=[ dict(type='CenterShift', apply_z=True), dict( type='RandomDropout', dropout_ratio=0.2, dropout_application_ratio=0.2), dict( type='RandomRotate', angle=[-1, 1], axis='z', center=[0, 0, 0], p=0.5), dict( type='RandomRotate', angle=[-0.015625, 0.015625], axis='x', p=0.5), dict( type='RandomRotate', angle=[-0.015625,
0.015625], axis='y', p=0.5), dict(type='RandomScale', scale=[0.9, 1.1]), dict(type='RandomFlip', p=0.5), dict(type='RandomJitter', sigma=0.005, clip=0.02), dict( type='ElasticDistortion', distortion_params=[[0.2, 0.4], [0.8, 1.6]]), dict(type='ChromaticAutoContrast', p=0.2, blend_factor=None), dict(type='ChromaticTranslation', p=0.95, ratio=0.05), dict(type='ChromaticJitter', p=0.95, std=0.05), dict( type='GridSample', grid_size=0.02, hash_type='fnv', mode='train', return_grid_coord=True), dict(type='SphereCrop', point_max=102400, mode='random'), dict(type='CenterShift', apply_z=False), dict(type='NormalizeColor'), dict(type='ToTensor'), dict( type='Collect', keys=('coord', 'grid_coord', 'segment'), feat_keys=('color', 'normal')) ], test_mode=False, loop=8), val=dict( type='ScanNetDataset', split='val', data_root='data/scannet', transform=[ dict(type='CenterShift', apply_z=True), dict( type='GridSample', grid_size=0.02, hash_type='fnv', mode='train', return_grid_coord=True), dict(type='CenterShift', apply_z=False), dict(type='NormalizeColor'), dict(type='ToTensor'), dict( type='Collect', keys=('coord', 'grid_coord', 'segment'), feat_keys=('color', 'normal')) ], test_mode=False), test=dict( type='ScanNetDataset', split='val', data_root='data/scannet', transform=[ dict(type='CenterShift', apply_z=True), dict(type='NormalizeColor') ], test_mode=True, test_cfg=dict( voxelize=dict( type='GridSample', grid_size=0.02, hash_type='fnv', mode='test', keys=('coord', 'color', 'normal'), return_grid_coord=True), crop=None, post_transform=[ dict(type='CenterShift', apply_z=False), dict(type='ToTensor'), dict( type='Collect', keys=('coord', 'grid_coord', 'index'), feat_keys=('color', 'normal')) ], aug_transform=[[{ 'type': 'RandomRotateTargetAngle', 'angle': [0], 'axis': 'z', 'center': [0, 0, 0], 'p': 1 }], [{ 'type': 'RandomRotateTargetAngle', 'angle': [0.5], 'axis': 'z', 'center': [0, 0, 0], 'p': 1 }], [{ 'type': 'RandomRotateTargetAngle', 'angle': [1], 'axis': 'z', 'center': [0, 0, 0], 'p': 1 }], [{ 'type': 'RandomRotateTargetAngle', 'angle': [1.5], 'axis': 'z', 'center': [0, 0, 0], 'p': 1 }], [{ 'type': 'RandomRotateTargetAngle', 'angle': [0], 'axis': 'z', 'center': [0, 0, 0], 'p': 1 }, { 'type': 'RandomScale', 'scale': [0.95, 0.95] }], [{ 'type': 'RandomRotateTargetAngle', 'angle': [0.5], 'axis': 'z', 'center': [0, 0, 0], 'p': 1 }, { 'type': 'RandomScale', 'scale': [0.95, 0.95] }], [{ 'type': 'RandomRotateTargetAngle', 'angle': [1], 'axis': 'z', 'center': [0, 0, 0], 'p': 1 }, { 'type': 'RandomScale', 'scale': [0.95, 0.95] }], [{ 'type': 'RandomRotateTargetAngle', 'angle': [1.5], 'axis': 'z', 'center': [0, 0, 0], 'p': 1 }, { 'type': 'RandomScale', 'scale': [0.95, 0.95] }], [{ 'type': 'RandomRotateTargetAngle', 'angle': [0], 'axis': 'z', 'center': [0, 0, 0], 'p': 1 }, { 'type': 'RandomScale', 'scale': [1.05, 1.05] }], [{ 'type': 'RandomRotateTargetAngle', 'angle': [0.5], 'axis': 'z', 'center': [0, 0, 0], 'p': 1 }, { 'type': 'RandomScale', 'scale': [1.05, 1.05] }], [{ 'type': 'RandomRotateTargetAngle', 'angle': [1], 'axis': 'z', 'center': [0, 0, 0], 'p': 1 }, { 'type': 'RandomScale', 'scale': [1.05, 1.05] }], [{ 'type': 'RandomRotateTargetAngle', 'angle': [1.5], 'axis': 'z', 'center': [0, 0, 0], 'p': 1 }, { 'type': 'RandomScale', 'scale': [1.05, 1.05] }], [{ 'type': 'RandomFlip', 'p': 1 }]]))) num_worker_per_gpu = 24 batch_size_per_gpu = 12 batch_size_val_per_gpu = 1 batch_size_test_per_gpu = 1

[2024-05-06 12:58:19,167 INFO train.py line 132 173332] => Building model ... [2024-05-06 12:58:23,484 INFO train.py line 209 173332] Num params: 46167572 [2024-05-06 12:58:24,947 INFO train.py line 134 173332] => Building writer ... [2024-05-06 12:58:24,949 INFO train.py line 219 173332] Tensorboard writer logging dir: exp/scannet/semseg-pt-v3m1-0-base [2024-05-06 12:58:24,950 INFO train.py line 136 173332] => Building train dataset & dataloader ... [2024-05-06 12:58:24,953 INFO scannet.py line 72 173332] Totally 1201 x 8 samples in train set. [2024-05-06 12:58:24,953 INFO train.py line 138 173332] => Building val dataset & dataloader ... [2024-05-06 12:58:24,954 INFO scannet.py line 72 173332] Totally 312 x 1 samples in val set. [2024-05-06 12:58:24,954 INFO train.py line 140 173332] => Building optimize, scheduler, scaler(amp) ... [2024-05-06 12:58:24,957 INFO optimizer.py line 54 173332] Params Group 1 - lr: 0.006; Params: ['seg_head.weight', 'seg_head.bias', 'backbone.embedding.stem.conv.weight', 'backbone.embedding.stem.norm.weight', 'backbone.embedding.stem.norm.bias', 'backbone.enc.enc1.down.proj.weight', 'backbone.enc.enc1.down.proj.bias', 'backbone.enc.enc1.down.norm.0.weight', 'backbone.enc.enc1.down.norm.0.bias', 'backbone.enc.enc2.down.proj.weight', 'backbone.enc.enc2.down.proj.bias', 'backbone.enc.enc2.down.norm.0.weight', 'backbone.enc.enc2.down.norm.0.bias', 'backbone.enc.enc3.down.proj.weight', 'backbone.enc.enc3.down.proj.bias', 'backbone.enc.enc3.down.norm.0.weight', 'backbone.enc.enc3.down.norm.0.bias', 'backbone.enc.enc4.down.proj.weight', 'backbone.enc.enc4.down.proj.bias', 'backbone.enc.enc4.down.norm.0.weight', 'backbone.enc.enc4.down.norm.0.bias', 'backbone.dec.dec3.up.proj.0.weight', 'backbone.dec.dec3.up.proj.0.bias', 'backbone.dec.dec3.up.proj.1.weight', 'backbone.dec.dec3.up.proj.1.bias', 'backbone.dec.dec3.up.proj_skip.0.weight', 'backbone.dec.dec3.up.proj_skip.0.bias', 'backbone.dec.dec3.up.proj_skip.1.weight', 'backbone.dec.dec3.up.proj_skip.1.bias', 'backbone.dec.dec2.up.proj.0.weight', 'backbone.dec.dec2.up.proj.0.bias', 'backbone.dec.dec2.up.proj.1.weight', 'backbone.dec.dec2.up.proj.1.bias', 'backbone.dec.dec2.up.proj_skip.0.weight', 'backbone.dec.dec2.up.proj_skip.0.bias', 'backbone.dec.dec2.up.proj_skip.1.weight', 'backbone.dec.dec2.up.proj_skip.1.bias', 'backbone.dec.dec1.up.proj.0.weight', 'backbone.dec.dec1.up.proj.0.bias', 'backbone.dec.dec1.up.proj.1.weight', 'backbone.dec.dec1.up.proj.1.bias', 'backbone.dec.dec1.up.proj_skip.0.weight', 'backbone.dec.dec1.up.proj_skip.0.bias', 'backbone.dec.dec1.up.proj_skip.1.weight', 'backbone.dec.dec1.up.proj_skip.1.bias', 'backbone.dec.dec0.up.proj.0.weight', 'backbone.dec.dec0.up.proj.0.bias', 'backbone.dec.dec0.up.proj.1.weight', 'backbone.dec.dec0.up.proj.1.bias', 'backbone.dec.dec0.up.proj_skip.0.weight', 'backbone.dec.dec0.up.proj_skip.0.bias', 'backbone.dec.dec0.up.proj_skip.1.weight', 'backbone.dec.dec0.up.proj_skip.1.bias']. 
[2024-05-06 12:58:24,957 INFO optimizer.py line 54 173332] Params Group 2 - lr: 0.0006; Params: ['backbone.enc.enc0.block0.cpe.0.weight', 'backbone.enc.enc0.block0.cpe.0.bias', 'backbone.enc.enc0.block0.cpe.1.weight', 'backbone.enc.enc0.block0.cpe.1.bias', 'backbone.enc.enc0.block0.cpe.2.weight', 'backbone.enc.enc0.block0.cpe.2.bias', 'backbone.enc.enc0.block0.norm1.0.weight', 'backbone.enc.enc0.block0.norm1.0.bias', 'backbone.enc.enc0.block0.attn.qkv.weight', 'backbone.enc.enc0.block0.attn.qkv.bias', 'backbone.enc.enc0.block0.attn.proj.weight', 'backbone.enc.enc0.block0.attn.proj.bias', 'backbone.enc.enc0.block0.norm2.0.weight', 'backbone.enc.enc0.block0.norm2.0.bias', 'backbone.enc.enc0.block0.mlp.0.fc1.weight', 'backbone.enc.enc0.block0.mlp.0.fc1.bias', 'backbone.enc.enc0.block0.mlp.0.fc2.weight', 'backbone.enc.enc0.block0.mlp.0.fc2.bias', 'backbone.enc.enc0.block1.cpe.0.weight', 'backbone.enc.enc0.block1.cpe.0.bias', 'backbone.enc.enc0.block1.cpe.1.weight', 'backbone.enc.enc0.block1.cpe.1.bias', 'backbone.enc.enc0.block1.cpe.2.weight', 'backbone.enc.enc0.block1.cpe.2.bias', 'backbone.enc.enc0.block1.norm1.0.weight', 'backbone.enc.enc0.block1.norm1.0.bias', 'backbone.enc.enc0.block1.attn.qkv.weight', 'backbone.enc.enc0.block1.attn.qkv.bias', 'backbone.enc.enc0.block1.attn.proj.weight', 'backbone.enc.enc0.block1.attn.proj.bias', 'backbone.enc.enc0.block1.norm2.0.weight', 'backbone.enc.enc0.block1.norm2.0.bias', 'backbone.enc.enc0.block1.mlp.0.fc1.weight', 'backbone.enc.enc0.block1.mlp.0.fc1.bias', 'backbone.enc.enc0.block1.mlp.0.fc2.weight', 'backbone.enc.enc0.block1.mlp.0.fc2.bias', 'backbone.enc.enc1.block0.cpe.0.weight', 'backbone.enc.enc1.block0.cpe.0.bias', 'backbone.enc.enc1.block0.cpe.1.weight', 'backbone.enc.enc1.block0.cpe.1.bias', 'backbone.enc.enc1.block0.cpe.2.weight', 'backbone.enc.enc1.block0.cpe.2.bias', 'backbone.enc.enc1.block0.norm1.0.weight', 'backbone.enc.enc1.block0.norm1.0.bias', 'backbone.enc.enc1.block0.attn.qkv.weight', 'backbone.enc.enc1.block0.attn.qkv.bias', 'backbone.enc.enc1.block0.attn.proj.weight', 'backbone.enc.enc1.block0.attn.proj.bias', 'backbone.enc.enc1.block0.norm2.0.weight', 'backbone.enc.enc1.block0.norm2.0.bias', 'backbone.enc.enc1.block0.mlp.0.fc1.weight', 'backbone.enc.enc1.block0.mlp.0.fc1.bias', 'backbone.enc.enc1.block0.mlp.0.fc2.weight', 'backbone.enc.enc1.block0.mlp.0.fc2.bias', 'backbone.enc.enc1.block1.cpe.0.weight', 'backbone.enc.enc1.block1.cpe.0.bias', 'backbone.enc.enc1.block1.cpe.1.weight', 'backbone.enc.enc1.block1.cpe.1.bias', 'backbone.enc.enc1.block1.cpe.2.weight', 'backbone.enc.enc1.block1.cpe.2.bias', 'backbone.enc.enc1.block1.norm1.0.weight', 'backbone.enc.enc1.block1.norm1.0.bias', 'backbone.enc.enc1.block1.attn.qkv.weight', 'backbone.enc.enc1.block1.attn.qkv.bias', 'backbone.enc.enc1.block1.attn.proj.weight', 'backbone.enc.enc1.block1.attn.proj.bias', 'backbone.enc.enc1.block1.norm2.0.weight', 'backbone.enc.enc1.block1.norm2.0.bias', 'backbone.enc.enc1.block1.mlp.0.fc1.weight', 'backbone.enc.enc1.block1.mlp.0.fc1.bias', 'backbone.enc.enc1.block1.mlp.0.fc2.weight', 'backbone.enc.enc1.block1.mlp.0.fc2.bias', 'backbone.enc.enc2.block0.cpe.0.weight', 'backbone.enc.enc2.block0.cpe.0.bias', 'backbone.enc.enc2.block0.cpe.1.weight', 'backbone.enc.enc2.block0.cpe.1.bias', 'backbone.enc.enc2.block0.cpe.2.weight', 'backbone.enc.enc2.block0.cpe.2.bias', 'backbone.enc.enc2.block0.norm1.0.weight', 'backbone.enc.enc2.block0.norm1.0.bias', 'backbone.enc.enc2.block0.attn.qkv.weight', 'backbone.enc.enc2.block0.attn.qkv.bias', 
'backbone.enc.enc2.block0.attn.proj.weight', 'backbone.enc.enc2.block0.attn.proj.bias', 'backbone.enc.enc2.block0.norm2.0.weight', 'backbone.enc.enc2.block0.norm2.0.bias', 'backbone.enc.enc2.block0.mlp.0.fc1.weight', 'backbone.enc.enc2.block0.mlp.0.fc1.bias', 'backbone.enc.enc2.block0.mlp.0.fc2.weight', 'backbone.enc.enc2.block0.mlp.0.fc2.bias', 'backbone.enc.enc2.block1.cpe.0.weight', 'backbone.enc.enc2.block1.cpe.0.bias', 'backbone.enc.enc2.block1.cpe.1.weight', 'backbone.enc.enc2.block1.cpe.1.bias', 'backbone.enc.enc2.block1.cpe.2.weight', 'backbone.enc.enc2.block1.cpe.2.bias', 'backbone.enc.enc2.block1.norm1.0.weight', 'backbone.enc.enc2.block1.norm1.0.bias', 'backbone.enc.enc2.block1.attn.qkv.weight', 'backbone.enc.enc2.block1.attn.qkv.bias', 'backbone.enc.enc2.block1.attn.proj.weight', 'backbone.enc.enc2.block1.attn.proj.bias', 'backbone.enc.enc2.block1.norm2.0.weight', 'backbone.enc.enc2.block1.norm2.0.bias', 'backbone.enc.enc2.block1.mlp.0.fc1.weight', 'backbone.enc.enc2.block1.mlp.0.fc1.bias', 'backbone.enc.enc2.block1.mlp.0.fc2.weight', 'backbone.enc.enc2.block1.mlp.0.fc2.bias', 'backbone.enc.enc3.block0.cpe.0.weight', 'backbone.enc.enc3.block0.cpe.0.bias', 'backbone.enc.enc3.block0.cpe.1.weight', 'backbone.enc.enc3.block0.cpe.1.bias', 'backbone.enc.enc3.block0.cpe.2.weight', 'backbone.enc.enc3.block0.cpe.2.bias', 'backbone.enc.enc3.block0.norm1.0.weight', 'backbone.enc.enc3.block0.norm1.0.bias', 'backbone.enc.enc3.block0.attn.qkv.weight', 'backbone.enc.enc3.block0.attn.qkv.bias', 'backbone.enc.enc3.block0.attn.proj.weight', 'backbone.enc.enc3.block0.attn.proj.bias', 'backbone.enc.enc3.block0.norm2.0.weight', 'backbone.enc.enc3.block0.norm2.0.bias', 'backbone.enc.enc3.block0.mlp.0.fc1.weight', 'backbone.enc.enc3.block0.mlp.0.fc1.bias', 'backbone.enc.enc3.block0.mlp.0.fc2.weight', 'backbone.enc.enc3.block0.mlp.0.fc2.bias', 'backbone.enc.enc3.block1.cpe.0.weight', 'backbone.enc.enc3.block1.cpe.0.bias', 'backbone.enc.enc3.block1.cpe.1.weight', 'backbone.enc.enc3.block1.cpe.1.bias', 'backbone.enc.enc3.block1.cpe.2.weight', 'backbone.enc.enc3.block1.cpe.2.bias', 'backbone.enc.enc3.block1.norm1.0.weight', 'backbone.enc.enc3.block1.norm1.0.bias', 'backbone.enc.enc3.block1.attn.qkv.weight', 'backbone.enc.enc3.block1.attn.qkv.bias', 'backbone.enc.enc3.block1.attn.proj.weight', 'backbone.enc.enc3.block1.attn.proj.bias', 'backbone.enc.enc3.block1.norm2.0.weight', 'backbone.enc.enc3.block1.norm2.0.bias', 'backbone.enc.enc3.block1.mlp.0.fc1.weight', 'backbone.enc.enc3.block1.mlp.0.fc1.bias', 'backbone.enc.enc3.block1.mlp.0.fc2.weight', 'backbone.enc.enc3.block1.mlp.0.fc2.bias', 'backbone.enc.enc3.block2.cpe.0.weight', 'backbone.enc.enc3.block2.cpe.0.bias', 'backbone.enc.enc3.block2.cpe.1.weight', 'backbone.enc.enc3.block2.cpe.1.bias', 'backbone.enc.enc3.block2.cpe.2.weight', 'backbone.enc.enc3.block2.cpe.2.bias', 'backbone.enc.enc3.block2.norm1.0.weight', 'backbone.enc.enc3.block2.norm1.0.bias', 'backbone.enc.enc3.block2.attn.qkv.weight', 'backbone.enc.enc3.block2.attn.qkv.bias', 'backbone.enc.enc3.block2.attn.proj.weight', 'backbone.enc.enc3.block2.attn.proj.bias', 'backbone.enc.enc3.block2.norm2.0.weight', 'backbone.enc.enc3.block2.norm2.0.bias', 'backbone.enc.enc3.block2.mlp.0.fc1.weight', 'backbone.enc.enc3.block2.mlp.0.fc1.bias', 'backbone.enc.enc3.block2.mlp.0.fc2.weight', 'backbone.enc.enc3.block2.mlp.0.fc2.bias', 'backbone.enc.enc3.block3.cpe.0.weight', 'backbone.enc.enc3.block3.cpe.0.bias', 'backbone.enc.enc3.block3.cpe.1.weight', 'backbone.enc.enc3.block3.cpe.1.bias', 
'backbone.enc.enc3.block3.cpe.2.weight', 'backbone.enc.enc3.block3.cpe.2.bias', 'backbone.enc.enc3.block3.norm1.0.weight', 'backbone.enc.enc3.block3.norm1.0.bias', 'backbone.enc.enc3.block3.attn.qkv.weight', 'backbone.enc.enc3.block3.attn.qkv.bias', 'backbone.enc.enc3.block3.attn.proj.weight', 'backbone.enc.enc3.block3.attn.proj.bias', 'backbone.enc.enc3.block3.norm2.0.weight', 'backbone.enc.enc3.block3.norm2.0.bias', 'backbone.enc.enc3.block3.mlp.0.fc1.weight', 'backbone.enc.enc3.block3.mlp.0.fc1.bias', 'backbone.enc.enc3.block3.mlp.0.fc2.weight', 'backbone.enc.enc3.block3.mlp.0.fc2.bias', 'backbone.enc.enc3.block4.cpe.0.weight', 'backbone.enc.enc3.block4.cpe.0.bias', 'backbone.enc.enc3.block4.cpe.1.weight', 'backbone.enc.enc3.block4.cpe.1.bias', 'backbone.enc.enc3.block4.cpe.2.weight', 'backbone.enc.enc3.block4.cpe.2.bias', 'backbone.enc.enc3.block4.norm1.0.weight', 'backbone.enc.enc3.block4.norm1.0.bias', 'backbone.enc.enc3.block4.attn.qkv.weight', 'backbone.enc.enc3.block4.attn.qkv.bias', 'backbone.enc.enc3.block4.attn.proj.weight', 'backbone.enc.enc3.block4.attn.proj.bias', 'backbone.enc.enc3.block4.norm2.0.weight', 'backbone.enc.enc3.block4.norm2.0.bias', 'backbone.enc.enc3.block4.mlp.0.fc1.weight', 'backbone.enc.enc3.block4.mlp.0.fc1.bias', 'backbone.enc.enc3.block4.mlp.0.fc2.weight', 'backbone.enc.enc3.block4.mlp.0.fc2.bias', 'backbone.enc.enc3.block5.cpe.0.weight', 'backbone.enc.enc3.block5.cpe.0.bias', 'backbone.enc.enc3.block5.cpe.1.weight', 'backbone.enc.enc3.block5.cpe.1.bias', 'backbone.enc.enc3.block5.cpe.2.weight', 'backbone.enc.enc3.block5.cpe.2.bias', 'backbone.enc.enc3.block5.norm1.0.weight', 'backbone.enc.enc3.block5.norm1.0.bias', 'backbone.enc.enc3.block5.attn.qkv.weight', 'backbone.enc.enc3.block5.attn.qkv.bias', 'backbone.enc.enc3.block5.attn.proj.weight', 'backbone.enc.enc3.block5.attn.proj.bias', 'backbone.enc.enc3.block5.norm2.0.weight', 'backbone.enc.enc3.block5.norm2.0.bias', 'backbone.enc.enc3.block5.mlp.0.fc1.weight', 'backbone.enc.enc3.block5.mlp.0.fc1.bias', 'backbone.enc.enc3.block5.mlp.0.fc2.weight', 'backbone.enc.enc3.block5.mlp.0.fc2.bias', 'backbone.enc.enc4.block0.cpe.0.weight', 'backbone.enc.enc4.block0.cpe.0.bias', 'backbone.enc.enc4.block0.cpe.1.weight', 'backbone.enc.enc4.block0.cpe.1.bias', 'backbone.enc.enc4.block0.cpe.2.weight', 'backbone.enc.enc4.block0.cpe.2.bias', 'backbone.enc.enc4.block0.norm1.0.weight', 'backbone.enc.enc4.block0.norm1.0.bias', 'backbone.enc.enc4.block0.attn.qkv.weight', 'backbone.enc.enc4.block0.attn.qkv.bias', 'backbone.enc.enc4.block0.attn.proj.weight', 'backbone.enc.enc4.block0.attn.proj.bias', 'backbone.enc.enc4.block0.norm2.0.weight', 'backbone.enc.enc4.block0.norm2.0.bias', 'backbone.enc.enc4.block0.mlp.0.fc1.weight', 'backbone.enc.enc4.block0.mlp.0.fc1.bias', 'backbone.enc.enc4.block0.mlp.0.fc2.weight', 'backbone.enc.enc4.block0.mlp.0.fc2.bias', 'backbone.enc.enc4.block1.cpe.0.weight', 'backbone.enc.enc4.block1.cpe.0.bias', 'backbone.enc.enc4.block1.cpe.1.weight', 'backbone.enc.enc4.block1.cpe.1.bias', 'backbone.enc.enc4.block1.cpe.2.weight', 'backbone.enc.enc4.block1.cpe.2.bias', 'backbone.enc.enc4.block1.norm1.0.weight', 'backbone.enc.enc4.block1.norm1.0.bias', 'backbone.enc.enc4.block1.attn.qkv.weight', 'backbone.enc.enc4.block1.attn.qkv.bias', 'backbone.enc.enc4.block1.attn.proj.weight', 'backbone.enc.enc4.block1.attn.proj.bias', 'backbone.enc.enc4.block1.norm2.0.weight', 'backbone.enc.enc4.block1.norm2.0.bias', 'backbone.enc.enc4.block1.mlp.0.fc1.weight', 'backbone.enc.enc4.block1.mlp.0.fc1.bias', 
'backbone.enc.enc4.block1.mlp.0.fc2.weight', 'backbone.enc.enc4.block1.mlp.0.fc2.bias', 'backbone.dec.dec3.block0.cpe.0.weight', 'backbone.dec.dec3.block0.cpe.0.bias', 'backbone.dec.dec3.block0.cpe.1.weight', 'backbone.dec.dec3.block0.cpe.1.bias', 'backbone.dec.dec3.block0.cpe.2.weight', 'backbone.dec.dec3.block0.cpe.2.bias', 'backbone.dec.dec3.block0.norm1.0.weight', 'backbone.dec.dec3.block0.norm1.0.bias', 'backbone.dec.dec3.block0.attn.qkv.weight', 'backbone.dec.dec3.block0.attn.qkv.bias', 'backbone.dec.dec3.block0.attn.proj.weight', 'backbone.dec.dec3.block0.attn.proj.bias', 'backbone.dec.dec3.block0.norm2.0.weight', 'backbone.dec.dec3.block0.norm2.0.bias', 'backbone.dec.dec3.block0.mlp.0.fc1.weight', 'backbone.dec.dec3.block0.mlp.0.fc1.bias', 'backbone.dec.dec3.block0.mlp.0.fc2.weight', 'backbone.dec.dec3.block0.mlp.0.fc2.bias', 'backbone.dec.dec3.block1.cpe.0.weight', 'backbone.dec.dec3.block1.cpe.0.bias', 'backbone.dec.dec3.block1.cpe.1.weight', 'backbone.dec.dec3.block1.cpe.1.bias', 'backbone.dec.dec3.block1.cpe.2.weight', 'backbone.dec.dec3.block1.cpe.2.bias', 'backbone.dec.dec3.block1.norm1.0.weight', 'backbone.dec.dec3.block1.norm1.0.bias', 'backbone.dec.dec3.block1.attn.qkv.weight', 'backbone.dec.dec3.block1.attn.qkv.bias', 'backbone.dec.dec3.block1.attn.proj.weight', 'backbone.dec.dec3.block1.attn.proj.bias', 'backbone.dec.dec3.block1.norm2.0.weight', 'backbone.dec.dec3.block1.norm2.0.bias', 'backbone.dec.dec3.block1.mlp.0.fc1.weight', 'backbone.dec.dec3.block1.mlp.0.fc1.bias', 'backbone.dec.dec3.block1.mlp.0.fc2.weight', 'backbone.dec.dec3.block1.mlp.0.fc2.bias', 'backbone.dec.dec2.block0.cpe.0.weight', 'backbone.dec.dec2.block0.cpe.0.bias', 'backbone.dec.dec2.block0.cpe.1.weight', 'backbone.dec.dec2.block0.cpe.1.bias', 'backbone.dec.dec2.block0.cpe.2.weight', 'backbone.dec.dec2.block0.cpe.2.bias', 'backbone.dec.dec2.block0.norm1.0.weight', 'backbone.dec.dec2.block0.norm1.0.bias', 'backbone.dec.dec2.block0.attn.qkv.weight', 'backbone.dec.dec2.block0.attn.qkv.bias', 'backbone.dec.dec2.block0.attn.proj.weight', 'backbone.dec.dec2.block0.attn.proj.bias', 'backbone.dec.dec2.block0.norm2.0.weight', 'backbone.dec.dec2.block0.norm2.0.bias', 'backbone.dec.dec2.block0.mlp.0.fc1.weight', 'backbone.dec.dec2.block0.mlp.0.fc1.bias', 'backbone.dec.dec2.block0.mlp.0.fc2.weight', 'backbone.dec.dec2.block0.mlp.0.fc2.bias', 'backbone.dec.dec2.block1.cpe.0.weight', 'backbone.dec.dec2.block1.cpe.0.bias', 'backbone.dec.dec2.block1.cpe.1.weight', 'backbone.dec.dec2.block1.cpe.1.bias', 'backbone.dec.dec2.block1.cpe.2.weight', 'backbone.dec.dec2.block1.cpe.2.bias', 'backbone.dec.dec2.block1.norm1.0.weight', 'backbone.dec.dec2.block1.norm1.0.bias', 'backbone.dec.dec2.block1.attn.qkv.weight', 'backbone.dec.dec2.block1.attn.qkv.bias', 'backbone.dec.dec2.block1.attn.proj.weight', 'backbone.dec.dec2.block1.attn.proj.bias', 'backbone.dec.dec2.block1.norm2.0.weight', 'backbone.dec.dec2.block1.norm2.0.bias', 'backbone.dec.dec2.block1.mlp.0.fc1.weight', 'backbone.dec.dec2.block1.mlp.0.fc1.bias', 'backbone.dec.dec2.block1.mlp.0.fc2.weight', 'backbone.dec.dec2.block1.mlp.0.fc2.bias', 'backbone.dec.dec1.block0.cpe.0.weight', 'backbone.dec.dec1.block0.cpe.0.bias', 'backbone.dec.dec1.block0.cpe.1.weight', 'backbone.dec.dec1.block0.cpe.1.bias', 'backbone.dec.dec1.block0.cpe.2.weight', 'backbone.dec.dec1.block0.cpe.2.bias', 'backbone.dec.dec1.block0.norm1.0.weight', 'backbone.dec.dec1.block0.norm1.0.bias', 'backbone.dec.dec1.block0.attn.qkv.weight', 'backbone.dec.dec1.block0.attn.qkv.bias', 
'backbone.dec.dec1.block0.attn.proj.weight', 'backbone.dec.dec1.block0.attn.proj.bias', 'backbone.dec.dec1.block0.norm2.0.weight', 'backbone.dec.dec1.block0.norm2.0.bias', 'backbone.dec.dec1.block0.mlp.0.fc1.weight', 'backbone.dec.dec1.block0.mlp.0.fc1.bias', 'backbone.dec.dec1.block0.mlp.0.fc2.weight', 'backbone.dec.dec1.block0.mlp.0.fc2.bias', 'backbone.dec.dec1.block1.cpe.0.weight', 'backbone.dec.dec1.block1.cpe.0.bias', 'backbone.dec.dec1.block1.cpe.1.weight', 'backbone.dec.dec1.block1.cpe.1.bias', 'backbone.dec.dec1.block1.cpe.2.weight', 'backbone.dec.dec1.block1.cpe.2.bias', 'backbone.dec.dec1.block1.norm1.0.weight', 'backbone.dec.dec1.block1.norm1.0.bias', 'backbone.dec.dec1.block1.attn.qkv.weight', 'backbone.dec.dec1.block1.attn.qkv.bias', 'backbone.dec.dec1.block1.attn.proj.weight', 'backbone.dec.dec1.block1.attn.proj.bias', 'backbone.dec.dec1.block1.norm2.0.weight', 'backbone.dec.dec1.block1.norm2.0.bias', 'backbone.dec.dec1.block1.mlp.0.fc1.weight', 'backbone.dec.dec1.block1.mlp.0.fc1.bias', 'backbone.dec.dec1.block1.mlp.0.fc2.weight', 'backbone.dec.dec1.block1.mlp.0.fc2.bias', 'backbone.dec.dec0.block0.cpe.0.weight', 'backbone.dec.dec0.block0.cpe.0.bias', 'backbone.dec.dec0.block0.cpe.1.weight', 'backbone.dec.dec0.block0.cpe.1.bias', 'backbone.dec.dec0.block0.cpe.2.weight', 'backbone.dec.dec0.block0.cpe.2.bias', 'backbone.dec.dec0.block0.norm1.0.weight', 'backbone.dec.dec0.block0.norm1.0.bias', 'backbone.dec.dec0.block0.attn.qkv.weight', 'backbone.dec.dec0.block0.attn.qkv.bias', 'backbone.dec.dec0.block0.attn.proj.weight', 'backbone.dec.dec0.block0.attn.proj.bias', 'backbone.dec.dec0.block0.norm2.0.weight', 'backbone.dec.dec0.block0.norm2.0.bias', 'backbone.dec.dec0.block0.mlp.0.fc1.weight', 'backbone.dec.dec0.block0.mlp.0.fc1.bias', 'backbone.dec.dec0.block0.mlp.0.fc2.weight', 'backbone.dec.dec0.block0.mlp.0.fc2.bias', 'backbone.dec.dec0.block1.cpe.0.weight', 'backbone.dec.dec0.block1.cpe.0.bias', 'backbone.dec.dec0.block1.cpe.1.weight', 'backbone.dec.dec0.block1.cpe.1.bias', 'backbone.dec.dec0.block1.cpe.2.weight', 'backbone.dec.dec0.block1.cpe.2.bias', 'backbone.dec.dec0.block1.norm1.0.weight', 'backbone.dec.dec0.block1.norm1.0.bias', 'backbone.dec.dec0.block1.attn.qkv.weight', 'backbone.dec.dec0.block1.attn.qkv.bias', 'backbone.dec.dec0.block1.attn.proj.weight', 'backbone.dec.dec0.block1.attn.proj.bias', 'backbone.dec.dec0.block1.norm2.0.weight', 'backbone.dec.dec0.block1.norm2.0.bias', 'backbone.dec.dec0.block1.mlp.0.fc1.weight', 'backbone.dec.dec0.block1.mlp.0.fc1.bias', 'backbone.dec.dec0.block1.mlp.0.fc2.weight', 'backbone.dec.dec0.block1.mlp.0.fc2.bias']. [2024-05-06 12:58:24,958 INFO train.py line 144 173332] => Building hooks ... [2024-05-06 12:58:24,958 INFO misc.py line 215 173332] => Loading checkpoint & weight ... 
[2024-05-06 12:58:24,958 INFO misc.py line 251 173332] No weight found at: None
[2024-05-06 12:58:24,958 INFO train.py line 151 173332] >>>>>>>>>>>>>>>> Start Training >>>>>>>>>>>>>>>>
Traceback (most recent call last):
  File "exp/scannet/semseg-pt-v3m1-0-base/code/tools/train.py", line 38, in <module>
    main()
  File "exp/scannet/semseg-pt-v3m1-0-base/code/tools/train.py", line 27, in main
    launch(
  File "/root/zwj/Pointcept-main/exp/scannet/semseg-pt-v3m1-0-base/code/pointcept/engines/launch.py", line 89, in launch
    main_func(cfg)
  File "exp/scannet/semseg-pt-v3m1-0-base/code/tools/train.py", line 20, in main_worker
    trainer.train()
  File "/root/zwj/Pointcept-main/exp/scannet/semseg-pt-v3m1-0-base/code/pointcept/engines/train.py", line 168, in train
    self.run_step()
  File "/root/zwj/Pointcept-main/exp/scannet/semseg-pt-v3m1-0-base/code/pointcept/engines/train.py", line 182, in run_step
    output_dict = self.model(input_dict)
  File "/root/anaconda3/envs/pointcept/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/zwj/Pointcept-main/exp/scannet/semseg-pt-v3m1-0-base/code/pointcept/models/default.py", line 54, in forward
    point = self.backbone(point)
  File "/root/anaconda3/envs/pointcept/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/zwj/Pointcept-main/exp/scannet/semseg-pt-v3m1-0-base/code/pointcept/models/point_transformer_v3/point_transformer_v3m1_base.py", line 704, in forward
    point = self.enc(point)
  File "/root/anaconda3/envs/pointcept/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/zwj/Pointcept-main/exp/scannet/semseg-pt-v3m1-0-base/code/pointcept/models/modules.py", line 62, in forward
    input = module(input)
  File "/root/anaconda3/envs/pointcept/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/zwj/Pointcept-main/exp/scannet/semseg-pt-v3m1-0-base/code/pointcept/models/modules.py", line 62, in forward
    input = module(input)
  File "/root/anaconda3/envs/pointcept/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/zwj/Pointcept-main/exp/scannet/semseg-pt-v3m1-0-base/code/pointcept/models/point_transformer_v3/point_transformer_v3m1_base.py", line 324, in forward
    point = self.drop_path(self.attn(point))
  File "/root/anaconda3/envs/pointcept/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/zwj/Pointcept-main/exp/scannet/semseg-pt-v3m1-0-base/code/pointcept/models/point_transformer_v3/point_transformer_v3m1_base.py", line 198, in forward
    attn = (q * self.scale) @ k.transpose(-2, -1)  # (N', H, K, K)
RuntimeError: CUDA out of memory. Tried to allocate 3.27 GiB (GPU 0; 44.35 GiB total capacity; 41.48 GiB already allocated; 1.60 GiB free; 41.78 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
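
For reference, here is a rough back-of-the-envelope estimate (illustrative only, not from the repo) of the attention score tensor that fails to allocate at point_transformer_v3m1_base.py line 198. The shapes and values are read off the config dump and traceback above; the actual point count after GridSample is somewhat smaller than the SphereCrop cap, which is consistent with the reported 3.27 GiB allocation.

```python
# Estimate of the (N', H, K, K) score tensor built at:
#     attn = (q * self.scale) @ k.transpose(-2, -1)
# where N' ~= N / K is the number of serialized patches of K points each.

def attn_score_gib(num_points, num_heads, patch_size, bytes_per_elem=2):
    """Size in GiB of the (N', H, K, K) score tensor, with N' ~= num_points // patch_size."""
    num_patches = num_points // patch_size
    return num_patches * num_heads * patch_size * patch_size * bytes_per_elem / 1024**3

# Assumed numbers from the config above: batch_size=12, SphereCrop point_max=102400,
# enc_num_head[0]=2, fp16 activations under AMP (2 bytes per element).
n_points = 12 * 102_400
for k in (1024, 256, 128):
    print(f"patch_size={k:4d}: ~{attn_score_gib(n_points, num_heads=2, patch_size=k):.2f} GiB")
# patch_size=1024: ~4.69 GiB, patch_size=256: ~1.17 GiB, patch_size=128: ~0.59 GiB.
# The score tensor scales linearly with K (and with batch size), which is why
# reducing patch_size, or enabling flash attention so it is never materialized,
# avoids the OOM.
```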

unrestrainednatural commented 1 month ago

I changed the batch size to 8 and enc/dec_patch_size to 128, and now training runs.

Gofinge commented 1 month ago

Hi, when flash_attention is disabled, please reduce patch_size to 128 or 256.
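
For anyone hitting the same error, here is a minimal sketch of the two changes discussed above as they would appear in configs/scannet/semseg-pt-v3m1-0-base.py. The key names are taken from the config dump earlier in this issue; the exact values are a judgment call for your GPU, and only the keys shown change, everything else stays as in the shipped config.

```python
# Sketch of the overrides for configs/scannet/semseg-pt-v3m1-0-base.py.
# Only the keys listed here change; all other entries keep their original values.

batch_size = 8  # was 12; drop further if memory is still tight

model = dict(
    type="DefaultSegmentorV2",
    # ... num_classes, backbone_out_channels, criteria as in the original config ...
    backbone=dict(
        type="PT-v3m1",
        # ... remaining backbone keys as in the original config ...
        enable_flash=False,                        # flash attention disabled, as in the run above
        enc_patch_size=(128, 128, 128, 128, 128),  # was (1024, 1024, 1024, 1024, 1024)
        dec_patch_size=(128, 128, 128, 128),       # was (1024, 1024, 1024, 1024)
    ),
)
```

If flash attention can be installed instead, keeping enable_flash=True with the original patch sizes is the lighter option, since the K x K score matrix is then never materialized.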

Bryan1203 commented 1 month ago

I am wondering whether it is possible to train a PTv2 model on a single RTX A2000 (12 GB of memory), or whether 24 GB is the minimum requirement. Would two or more A2000 cards work? Would training a PTv3 model take less memory?

Bryan1203 commented 4 weeks ago

I just checked the Point Transformer V3 paper; it says training takes under 12 GB of memory. Does that mean I could train on just one A2000 GPU? I am trying to train on the SemanticKITTI dataset with PTv3. Did you train with just one RTX 4090, or are 4 or 8 GPUs needed?

Gofinge commented 3 weeks ago

Hi, we only had A100s during our experiments, but I think PTv3 is efficient enough to meet your needs. Just give it a try; if you run into any unexpected issues, let me know and I will tell you how to adjust the model.

Bryan1203 commented 3 weeks ago

Got it! Thanks for the reply and for your work. I am wondering whether the SemanticKITTI config file for PTv3 is still on schedule to be released soon?

Gofinge commented 3 weeks ago

It should be available soon!!! I also wish I could have more time! But CVPR is coming up soon, and too many things need to be finished before then.