Closed: huanghuang113 closed this issue 10 months ago
This indicates that none of the training samples remain valid after cropping (see here). It usually depends on the point density. Please check it on your own data.
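The validity check being referred to can be sketched roughly as follows. This is a minimal illustration of the idea, not SoftGroup's actual code: the function name `crop_is_valid`, the crop geometry, and the default values are assumptions for the sake of the example.

```python
import numpy as np

def crop_is_valid(xyz, crop_size=50.0, min_npoint=5000, rng=None):
    """Randomly crop an axis-aligned cube from the scene and report
    whether the crop still contains at least min_npoint points.

    If the scene is sparse, every crop may fall below min_npoint,
    which is how a whole batch can end up with no valid samples.
    """
    if rng is None:
        rng = np.random.default_rng()
    lo = xyz.min(axis=0)
    hi = xyz.max(axis=0)
    # Pick a random crop origin inside the scene bounds; if the scene
    # is smaller than the crop, the crop covers the whole scene.
    origin = lo + rng.random(3) * np.maximum(hi - lo - crop_size, 0)
    mask = np.all((xyz >= origin) & (xyz <= origin + crop_size), axis=1)
    return bool(mask.sum() >= min_npoint), mask
```

A dense cloud passes the check while a sparse one of the same extent fails it, which is why the maintainer points at point density rather than at the model.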
Thanks for the suggestion, I will try it!
> This indicates that none of the training samples remain valid after cropping (see here). It usually depends on the point density. Please check it on your own data.
But I used the default S3DIS config and it has the same problem. Could I set it to a lower value?
Yes. I think you could change min_npoint to a lower value or increase the batch size.
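For reference, that change lives in the `voxel_cfg` of the data config (and the batch size in `dataloader`). The values below are illustrative, not recommendations:

```yaml
data:
  train:
    voxel_cfg:
      min_npoint: 1000   # lowered from 5000 (illustrative value)
dataloader:
  train:
    batch_size: 4        # alternatively, increase the batch size
```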
Dear Thang Vu, thank you very much for your contribution to 3D point cloud instance segmentation. I am having some problems with your model and would like your advice. Data description: my data is a set of plant point clouds with only two semantic classes, leaf and stalk. After processing my dataset following the STPLS3D preprocessing, I run into problems during training:

2023-08-01 14:09:55,900 - INFO - Config:

```yaml
model:
  channels: 32
  num_blocks: 3
  semantic_classes: 2
  instance_classes: 2
  sem2ins_classes: []
  semantic_only: True
  semantic_weight: [1.0, 1.0]  # set per class; two classes here, both weighted 1.0
  with_coords: False
  ignore_label: -100
  grouping_cfg:
    score_thr: 0.2
    radius: 0.04  # reduced to fit the scale of my dataset
    mean_active: 300
    class_numpoint_mean: [1.0, 13624.0]  # mean number of points per class
    npoint_thr: 0.05
    ignore_classes: []
  instance_voxel_cfg:
    scale: 50  # reduced voxel scale to fit my dataset
    spatial_shape: 20
  train_cfg:
    max_proposal_num: 200
    pos_iou_thr: 0.5
  test_cfg:
    x4_split: False
    cls_score_thr: 0.001
    mask_score_thr: -0.5
    min_npoint: 100
    eval_tasks: ['semantic']

data:
  train:
    type: 'plant'
    data_root: 'dataset/plant'
    prefix: 'train'
    suffix: '.pth'
    training: True
    repeat: 5  # number of repeats
    voxel_cfg:
      scale: 50  # reduced voxel scale to fit my dataset
      spatial_shape: [128, 512]  # adjusted voxel grid size
      max_npoint: 250000  # adjusted maximum points per sample
      min_npoint: 5000  # adjusted minimum points per sample
  test:
    type: 'plant'
    data_root: 'dataset/plant'
    prefix: 'val_250m'
    suffix: '.pth'
    training: False
    voxel_cfg:
      scale: 50  # reduced voxel scale to fit my dataset
      spatial_shape: [128, 512]  # adjusted voxel grid size
      max_npoint: 250000  # adjusted maximum points per sample
      min_npoint: 5000  # adjusted minimum points per sample

dataloader:
  train:
    batch_size: 2
    num_workers: 4
  test:
    batch_size: 1
    num_workers: 1

optimizer:
  type: 'Adam'
  lr: 0.004

fp16: False
epochs: 5
step_epoch: 0
save_freq: 2
pretrain: ''
work_dir: ''
```
```
2023-08-01 14:09:55,900 - INFO - Distributed: False
2023-08-01 14:09:55,900 - INFO - Mix precision training: False
2023-08-01 14:10:01,103 - INFO - Load train dataset: 1720 scans
2023-08-01 14:10:01,103 - INFO - Load test dataset: 86 scans
2023-08-01 14:10:01,104 - INFO - Training
2023-08-01 14:10:01,695 - INFO - batch is truncated from size 2 to 1
2023-08-01 14:10:03,778 - INFO - batch is truncated from size 2 to 1
2023-08-01 14:10:04,734 - INFO - batch is truncated from size 2 to 1
2023-08-01 14:10:05,422 - INFO - batch is truncated from size 2 to 1
2023-08-01 14:10:05,779 - INFO - Epoch [1/5][10/860] lr: 0.004, eta: 0:33:24, mem: 449, data_time: 0.00, iter_time: 0.05, semantic_loss: 0.5232, offset_loss: 0.1780, loss: 0.7011
2023-08-01 14:10:05,920 - INFO - batch is truncated from size 2 to 1
2023-08-01 14:10:07,859 - INFO - batch is truncated from size 2 to 1
2023-08-01 14:10:09,642 - INFO - batch is truncated from size 2 to 1
2023-08-01 14:10:09,860 - INFO - batch is truncated from size 2 to 1
2023-08-01 14:10:10,891 - INFO - batch is truncated from size 2 to 1
2023-08-01 14:10:11,196 - INFO - Epoch [1/5][20/860] lr: 0.004, eta: 0:35:39, mem: 449, data_time: 1.11, iter_time: 1.30, semantic_loss: 0.3730, offset_loss: 0.2148, loss: 0.5878
2023-08-01 14:10:15,136 - INFO - batch is truncated from size 2 to 1
2023-08-01 14:10:15,641 - INFO - Epoch [1/5][30/860] lr: 0.004, eta: 0:34:20, mem: 449, data_time: 0.00, iter_time: 0.13, semantic_loss: 0.1968, offset_loss: 0.2220, loss: 0.4187
2023-08-01 14:10:15,793 - INFO - batch is truncated from size 2 to 1
2023-08-01 14:10:16,407 - INFO - batch is truncated from size 2 to 1
2023-08-01 14:10:20,568 - INFO - batch is truncated from size 2 to 1
2023-08-01 14:10:20,630 - INFO - batch is truncated from size 2 to 1
2023-08-01 14:10:20,661 - INFO - Epoch [1/5][40/860] lr: 0.004, eta: 0:34:40, mem: 449, data_time: 3.90, iter_time: 3.96, semantic_loss: 0.4106, offset_loss: 0.2034, loss: 0.6140
2023-08-01 14:10:20,688 - INFO - batch is truncated from size 2 to 1
```

```
Traceback (most recent call last):
  File "/GitProject/SoftGroup/tools/train.py", line 207, in <module>
    main()
  File "/GitProject/SoftGroup/tools/train.py", line 200, in main
    train(epoch, model, optimizer, scaler, train_loader, cfg, logger, writer)
  File "/GitProject/SoftGroup/tools/train.py", line 44, in train
    for i, batch in enumerate(train_loader, start=1):
  File "/home/hhroot/anaconda3/envs/softg/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/home/hhroot/anaconda3/envs/softg/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1204, in _next_data
    return self._process_data(data)
  File "/home/hhroot/anaconda3/envs/softg/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
    data.reraise()
  File "/home/hhroot/anaconda3/envs/softg/lib/python3.7/site-packages/torch/_utils.py", line 457, in reraise
    raise exception
AssertionError: Caught AssertionError in DataLoader worker process 2.
Original Traceback (most recent call last):
  File "/home/hhroot/anaconda3/envs/softg/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/hhroot/anaconda3/envs/softg/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    return self.collate_fn(data)
  File "/GitProject/SoftGroup/softgroup/data/custom.py", line 222, in collate_fn
    assert batch_id > 0, 'empty batch'
AssertionError: empty batch
```
```
ERROR conda.cli.main_run:execute(47): `conda run python /GitProject/SoftGroup/tools/train.py ../configs/softgroup/softgroup_my_dataset_backbone.yaml` failed. (See above for error)
```

I have tried changing my parameters, but I have never been able to solve the problem, so I hope you can give me some advice!
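The assertion at the end of the traceback can be understood with a minimal sketch. This mirrors the behaviour described in the thread rather than SoftGroup's exact `collate_fn`: samples whose crop failed the validity check come back as `None`, collate drops them, and when every sample in a batch is dropped the counter stays at zero and `'empty batch'` fires.

```python
def collate_fn(batch):
    """Hypothetical sketch: keep only valid samples from a batch.

    A sample is None when its random crop ended up with fewer than
    min_npoint points. Dropping some samples shrinks the batch
    ("batch is truncated from size 2 to 1" in the log above);
    dropping all of them triggers the 'empty batch' assertion.
    """
    valid = []
    batch_id = 0
    for sample in batch:
        if sample is None:  # invalid crop, sample dropped
            continue
        valid.append(sample)
        batch_id += 1
    if 0 < batch_id < len(batch):
        print(f'batch is truncated from size {len(batch)} to {batch_id}')
    assert batch_id > 0, 'empty batch'
    return valid
```

So the repeated "batch is truncated" messages and the final crash are two sides of the same problem: with batch_size 2, a single invalid crop truncates the batch, and two invalid crops in the same batch empty it, which is why lowering `min_npoint` or raising the batch size both reduce the chance of a crash.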