PJLab-ADG / OpenPCSeg

OpenPCSeg: Open Source Point Cloud Segmentation Toolbox and Benchmark

ZeroDivisionError: division by zero #26

Open ccdontworry opened 7 months ago

ccdontworry commented 7 months ago

Hello! I encountered an error while running the training file:

(pcseg) root@9e1fedc6ce4f:/workspace/data/OpenPCSeg-master# python train.py --cfg_file tools/cfgs/voxel/semantic_kitti/minkunet_mk34_cr10.yaml
2024-04-16 11:44:23,277 INFO ** Start logging **
2024-04-16 11:44:23,277 INFO CUDA_VISIBLE_DEVICES=ALL
2024-04-16 11:44:23,277 INFO cfg_file tools/cfgs/voxel/semantic_kitti/minkunet_mk34_cr10.yaml
2024-04-16 11:44:23,278 INFO extra_tag default
2024-04-16 11:44:23,278 INFO set_cfgs None
2024-04-16 11:44:23,278 INFO fix_random_seed False
2024-04-16 11:44:23,278 INFO batch_size 12
2024-04-16 11:44:23,279 INFO epochs 36
2024-04-16 11:44:23,279 INFO sync_bn False
2024-04-16 11:44:23,279 INFO ckp None
2024-04-16 11:44:23,279 INFO pretrained_model None
2024-04-16 11:44:23,280 INFO amp False
2024-04-16 11:44:23,280 INFO ckp_save_interval 1
2024-04-16 11:44:23,280 INFO max_ckp_save_num 30
2024-04-16 11:44:23,280 INFO merge_all_iters_to_one_epoch False
2024-04-16 11:44:23,281 INFO eval False
2024-04-16 11:44:23,281 INFO eval_interval 50
2024-04-16 11:44:23,281 INFO workers 5
2024-04-16 11:44:23,281 INFO local_rank 0
2024-04-16 11:44:23,282 INFO launcher none
2024-04-16 11:44:23,282 INFO tcp_port 18888
2024-04-16 11:44:23,282 INFO cfg.ROOT_DIR: /workspace/data/OpenPCSeg-master
2024-04-16 11:44:23,282 INFO cfg.LOCAL_RANK: 0
2024-04-16 11:44:23,282 INFO cfg.MODALITY: voxel
2024-04-16 11:44:23,283 INFO cfg.DATA = edict()
2024-04-16 11:44:23,283 INFO cfg.DATA.DATASET: semantickitti
2024-04-16 11:44:23,283 INFO cfg.DATA.PETRELOSS_CONFIG: None
2024-04-16 11:44:23,283 INFO cfg.DATA.DATA_PATH: /workspace/data/SemanticKITTI/dataset/
2024-04-16 11:44:23,284 INFO cfg.DATA.VOXEL_SIZE: 0.05
2024-04-16 11:44:23,284 INFO cfg.DATA.AUGMENT: GlobalAugment_LP
2024-04-16 11:44:23,284 INFO cfg.DATA.NUM_POINTS: 1000000
2024-04-16 11:44:23,284 INFO cfg.DATA.TRAINVAL: False
2024-04-16 11:44:23,284 INFO cfg.DATA.TTA: False
2024-04-16 11:44:23,285 INFO cfg.MODEL = edict()
2024-04-16 11:44:23,285 INFO cfg.MODEL.NAME: MinkUNet
2024-04-16 11:44:23,285 INFO cfg.MODEL.IGNORE_LABEL: 0
2024-04-16 11:44:23,285 INFO cfg.MODEL.IN_FEATURE_DIM: 4
2024-04-16 11:44:23,285 INFO cfg.MODEL.BLOCK: ResBlock
2024-04-16 11:44:23,286 INFO cfg.MODEL.NUM_LAYER: [2, 3, 4, 6, 2, 2, 2, 2, 2]
2024-04-16 11:44:23,286 INFO cfg.MODEL.PLANES: [32, 32, 64, 128, 256, 256, 128, 96, 96]
2024-04-16 11:44:23,286 INFO cfg.MODEL.cr: 1.0
2024-04-16 11:44:23,286 INFO cfg.MODEL.DROPOUT_P: 0.0
2024-04-16 11:44:23,286 INFO cfg.MODEL.LABEL_SMOOTHING: 0.1
2024-04-16 11:44:23,287 INFO cfg.MODEL.IF_DIST: True
2024-04-16 11:44:23,287 INFO cfg.OPTIM = edict()
2024-04-16 11:44:23,287 INFO cfg.OPTIM.BATCH_SIZE_PER_GPU: 12
2024-04-16 11:44:23,287 INFO cfg.OPTIM.NUM_EPOCHS: 36
2024-04-16 11:44:23,288 INFO cfg.OPTIM.OPTIMIZER: sgd
2024-04-16 11:44:23,288 INFO cfg.OPTIM.LR_PER_SAMPLE: 0.02
2024-04-16 11:44:23,288 INFO cfg.OPTIM.WEIGHT_DECAY: 0.0001
2024-04-16 11:44:23,288 INFO cfg.OPTIM.MOMENTUM: 0.9
2024-04-16 11:44:23,288 INFO cfg.OPTIM.NESTEROV: True
2024-04-16 11:44:23,289 INFO cfg.OPTIM.GRAD_NORM_CLIP: 10
2024-04-16 11:44:23,289 INFO cfg.OPTIM.SCHEDULER: linear_warmup_with_cosdecay
2024-04-16 11:44:23,289 INFO cfg.OPTIM.WARMUP_EPOCH: 1
2024-04-16 11:44:23,289 INFO cfg.OPTIM.LR: 0.24
2024-04-16 11:44:23,289 INFO cfg.TAG: minkunet_mk34_cr10
2024-04-16 11:44:23,290 INFO cfg.EXP_GROUP_PATH: voxel/semantic_kitti
The total sample is 0
Fine grained version!!!
Traceback (most recent call last):
  File "train.py", line 573, in <module>
    main()
  File "train.py", line 548, in main
    trainer = Trainer(args, cfgs)
  File "train.py", line 189, in __init__
    optim_cfg=cfgs.OPTIM,
  File "/workspace/data/OpenPCSeg-master/pcseg/optim/__init__.py", line 127, in build_scheduler
    lr_lambda=lambda x: linear_warmup_with_cosdecay(x, warmup_steps, total_steps),
  File "/root/miniconda3/envs/pcseg/lib/python3.7/site-packages/torch/optim/lr_scheduler.py", line 203, in __init__
    super(LambdaLR, self).__init__(optimizer, last_epoch, verbose)
  File "/root/miniconda3/envs/pcseg/lib/python3.7/site-packages/torch/optim/lr_scheduler.py", line 77, in __init__
    self.step()
  File "/root/miniconda3/envs/pcseg/lib/python3.7/site-packages/torch/optim/lr_scheduler.py", line 152, in step
    values = self.get_lr()
  File "/root/miniconda3/envs/pcseg/lib/python3.7/site-packages/torch/optim/lr_scheduler.py", line 251, in get_lr
    for lmbda, base_lr in zip(self.lr_lambdas, self.base_lrs)]
  File "/root/miniconda3/envs/pcseg/lib/python3.7/site-packages/torch/optim/lr_scheduler.py", line 251, in <listcomp>
    for lmbda, base_lr in zip(self.lr_lambdas, self.base_lrs)]
  File "/workspace/data/OpenPCSeg-master/pcseg/optim/__init__.py", line 127, in <lambda>
    lr_lambda=lambda x: linear_warmup_with_cosdecay(x, warmup_steps, total_steps),
  File "/workspace/data/OpenPCSeg-master/pcseg/optim/__init__.py", line 77, in linear_warmup_with_cosdecay
    ratio = (cur_step - warmup_steps) / total_steps
ZeroDivisionError: division by zero

The command I used to run training is: python train.py --cfg_file tools/cfgs/voxel/semantic_kitti/minkunet_mk34_cr10.yaml

What could cause this error? I have checked my dataset path and it looks correct, but I cannot think of any other cause. Thank you.

BurtonMan commented 7 months ago

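The dataset directory layout that the code expects is the same as Cylinder3D's, so you only need to rearrange your dataset directories to match. Here is the relevant code; reading it should make this clear. Scans are collected from <root_path>/<seq>/velodyne, and each label path is derived from the scan path by replacing 'velodyne' with 'labels' and the 'bin' extension with 'label':

    self.annos += absoluteFilePaths('/'.join([self.root_path, str(seq).zfill(2), 'velodyne']))

    self.annos[index].replace('velodyne', 'labels')[:-3] + 'label', dtype=np.uint32
    ).reshape((-1, 1))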
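One way to do the rearrangement without copying data is to create symlinks. The sketch below is only an illustration, not something from the OpenPCSeg repository: the function name `link_per_sequence_layout` and its parameters are hypothetical, and it assumes the source data is stored as <flat_root>/<subdir>/<seq> while the loader wants <out_root>/<seq>/<subdir>. It also assumes a filesystem where symlinks are allowed.

```python
import os
from pathlib import Path


def link_per_sequence_layout(flat_root, out_root,
                             subdirs=("velodyne", "labels", "calib")):
    """Hypothetical helper: expose <flat_root>/<subdir>/<seq> as
    <out_root>/<seq>/<subdir> through directory symlinks.

    Returns the number of links created. Existing targets are left untouched.
    """
    flat_root, out_root = Path(flat_root), Path(out_root)
    created = 0
    for subdir in subdirs:
        src_parent = flat_root / subdir
        if not src_parent.is_dir():
            continue  # this kind of data is not present in the source tree
        for seq_dir in sorted(p for p in src_parent.iterdir() if p.is_dir()):
            dst = out_root / seq_dir.name / subdir
            if dst.exists() or dst.is_symlink():
                continue  # do not overwrite anything that is already there
            dst.parent.mkdir(parents=True, exist_ok=True)
            os.symlink(seq_dir.resolve(), dst, target_is_directory=True)
            created += 1
    return created
```

After running it, DATA_PATH would point at `out_root`, so that a path such as <out_root>/00/velodyne resolves to the scans of sequence 00. Copying or moving the directories by hand achieves the same result.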

xiaoman-liu commented 6 months ago

Hi, I also ran into this problem. I followed data_prepare.md, but the data structure it describes is not the one the code expects:

└── SemanticKitti
    └── dataset
        ├── velodyne <- contains the .bin files; a .bin file contains the points in a point cloud
        │   └── 00
        │   └── ···
        │   └── 21
        ├── labels <- contains the .label files; a .label file contains the labels of the points in a point cloud
        │   └── 00
        │   └── ···
        │   └── 10
        ├── calib
        │   └── 00
        │   └── ···
        │   └── 21
        └── semantic-kitti.yaml

The code actually expects the data to be organized per sequence, like this:

SemanticKITTI\dataset\00\velodyne

With the documented layout, the following function finds no files, so the dataset is empty and the error occurs:

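import os

def absoluteFilePaths(directory):
    # Yield the absolute path of every file under `directory`.
    # os.walk yields nothing if `directory` does not exist.
    for dirpath, _, filenames in os.walk(directory):
        for f in filenames:
            yield os.path.abspath(os.path.join(dirpath, f))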
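To make the failure chain concrete, here is a minimal sketch of how a missing <seq>/velodyne directory turns into a division by zero. `count_scans` mirrors the `self.annos += absoluteFilePaths(...)` line quoted earlier in the thread. `total_steps_or_raise` is a hypothetical guard, and the way it derives the step count from the number of samples is an assumption for illustration; the repository may compute `total_steps` differently.

```python
import os


def absoluteFilePaths(directory):
    # os.walk silently yields nothing when `directory` does not exist,
    # because its default onerror handler ignores the listing error.
    for dirpath, _, filenames in os.walk(directory):
        for f in filenames:
            yield os.path.abspath(os.path.join(dirpath, f))


def count_scans(root_path, sequences):
    """Count the files found under <root_path>/<seq>/velodyne."""
    annos = []
    for seq in sequences:
        annos += absoluteFilePaths(
            "/".join([root_path, str(seq).zfill(2), "velodyne"]))
    return len(annos)


def total_steps_or_raise(num_samples, batch_size, num_epochs):
    """Hypothetical guard: fail early with a readable message instead of
    letting the LR schedule divide by zero later."""
    # Assumed formula, for illustration only.
    total_steps = (num_samples // batch_size) * num_epochs
    if total_steps == 0:
        raise ValueError(
            "No training steps: found %d samples. Check that the dataset "
            "is laid out as <DATA_PATH>/<seq>/velodyne." % num_samples)
    return total_steps
```

Calling `count_scans` on the DATA_PATH from the log, with the sequence numbers of the training split, is a quick check: a result of 0 matches the "The total sample is 0" line printed just before the traceback.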

TengfeiZeng commented 4 months ago

Hello, may I ask how you finally solved it?