Closed guanshanjushi closed 1 year ago
I am training on my own dataset. I changed num_class and also set batch_size=2 and worker_num=2:
worker_num: 2
TrainReader:
  sample_transforms:
EvalReader:
  sample_transforms:
TestReader:
  inputs_def:
    image_shape: [3, 640, 640]
  sample_transforms:
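I found the cause. After repeated testing, the main culprit turned out to be the CUDA version of the paddle build: training the latest rtdetr with the CUDA 10.2 build of paddle produces the error described in this issue, while the CUDA 11.1 build and later do not. Also make sure the installed cuDNN version matches the one paddle requires. May no one be troubled by environment configuration.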
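The startup warning in the report's log shows a GPU with compute capability 8.6 running against a 10.2 CUDA runtime. Support for compute capability 8.6 was introduced in CUDA 11.1, which is consistent with the fix described above, although the report does not confirm that this is the exact mechanism. The following is a minimal sketch that flags such a mismatch from the warning line. The names `MIN_CUDA_FOR_CC`, `parse_gpu_line`, and `runtime_too_old` are hypothetical helpers written for this illustration, not part of Paddle, and the table covers only two capabilities.

```python
import re

# Illustrative subset (an assumption of this sketch, not an exhaustive table):
# lowest CUDA runtime that supports each compute capability.
MIN_CUDA_FOR_CC = {
    (8, 0): (11, 0),
    (8, 6): (11, 1),
}

_GPU_LINE = re.compile(
    r"GPU Compute Capability: (\d+)\.(\d+)"
    r".*?Runtime API Version: (\d+)\.(\d+)"
)


def parse_gpu_line(line):
    """Return ((cc_major, cc_minor), (rt_major, rt_minor)) or None if no match."""
    match = _GPU_LINE.search(line)
    if match is None:
        return None
    cc = (int(match.group(1)), int(match.group(2)))
    runtime = (int(match.group(3)), int(match.group(4)))
    return cc, runtime


def runtime_too_old(line):
    """True when the logged CUDA runtime is older than the table's minimum."""
    parsed = parse_gpu_line(line)
    if parsed is None:
        return False
    cc, runtime = parsed
    required = MIN_CUDA_FOR_CC.get(cc)
    # Unknown capabilities are not flagged; the table is only a sketch.
    return required is not None and runtime < required


LOG_LINE = (
    "W0425 14:15:37.660985 65296 gpu_resources.cc:61] Please NOTE: device: 0, "
    "GPU Compute Capability: 8.6, Driver API Version: 12.0, "
    "Runtime API Version: 10.2"
)

if __name__ == "__main__":
    print(runtime_too_old(LOG_LINE))
```

Run against the line from this report, the check flags the 8.6 / 10.2 combination. The same line with a runtime of 11.1 or later would pass.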
Search before asking
Bug Component
Training
Describe the Bug
I ran into the following problem while training rtdetr:
INFO 2023-04-25 14:15:35,478 utils.py:148] Note: NumExpr detected 16 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
loading annotations into memory...
Done (t=0.58s)
creating index...
index created!
[04/25 14:15:37] ppdet.data.source.coco INFO: Load [4849 samples valid, 1 samples invalid] in file /home/wxp/wxp_dataset/newcityevent/验证训练集/dataset_科技部课题/train/coco_train.json.
W0425 14:15:37.660985 65296 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.0, Runtime API Version: 10.2
W0425 14:15:37.663920 65296 gpu_resources.cc:91] device: 0, cuDNN Version: 7.6.
[04/25 14:15:39] ppdet.utils.checkpoint INFO: ['fc.bias', 'fc.weight', 'last_conv.weight'] in pretrained weight is not used in the model, and its will not be loaded
[04/25 14:15:39] ppdet.utils.checkpoint INFO: Finish loading model weights: /home/wxp/project_wxp/github/YOLO/PaddleDetection/pretrain_weights/PPHGNetV2_L_ssld_pretrained.pdparams
Traceback (most recent call last):
File "/home/wxp/project_wxp/github/YOLO/PaddleDetection/tools/train.py", line 204, in <module>
main()
File "/home/wxp/project_wxp/github/YOLO/PaddleDetection/tools/train.py", line 200, in main
run(FLAGS, cfg)
File "/home/wxp/project_wxp/github/YOLO/PaddleDetection/tools/train.py", line 153, in run
trainer.train(FLAGS.eval)
File "/home/wxp/project_wxp/github/YOLO/PaddleDetection/ppdet/engine/trainer.py", line 542, in train
outputs = model(data)
File "/home/wxp/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "/home/wxp/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "/home/wxp/project_wxp/github/YOLO/PaddleDetection/ppdet/modeling/architectures/meta_arch.py", line 60, in forward
out = self.get_loss()
File "/home/wxp/project_wxp/github/YOLO/PaddleDetection/ppdet/modeling/architectures/detr.py", line 113, in get_loss
return self._forward()
File "/home/wxp/project_wxp/github/YOLO/PaddleDetection/ppdet/modeling/architectures/detr.py", line 87, in _forward
out_transformer = self.transformer(body_feats, pad_mask, self.inputs)
File "/home/wxp/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "/home/wxp/anaconda3/lib/python3.9/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "/home/wxp/project_wxp/github/YOLO/PaddleDetection/ppdet/modeling/transformers/rtdetr_transformer.py", line 442, in forward
get_contrastive_denoising_training_group(gt_meta,
File "/home/wxp/project_wxp/github/YOLO/PaddleDetection/ppdet/modeling/transformers/utils.py", line 258, in get_contrastive_denoising_training_group
dn_positive_idx = paddle.split(dn_positive_idx,
File "/home/wxp/anaconda3/lib/python3.9/site-packages/paddle/tensor/manipulation.py", line 954, in split
return paddle.fluid.layers.split(
File "/home/wxp/anaconda3/lib/python3.9/site-packages/paddle/fluid/layers/nn.py", line 5097, in split
_C_ops.split(input, out, attrs)
ValueError: (InvalidArgument) Sum of Attr(num_or_sections) must be equal to the input's size along the split dimension. But received Attr(num_or_sections) = [80, 52], input(X)'s shape = [1638400], Attr(dim) = 0.
[Hint: Expected sum_of_section == input_axis_dim, but received sum_of_section:132 != input_axis_dim:1638400.] (at /paddle/paddle/fluid/operators/split_op.h:100)
[operator < split > error]
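The error above says that the requested section sizes [80, 52] sum to 132, while the tensor being split has 1638400 elements along dimension 0. The invariant being enforced is that the section sizes must add up exactly to the length of the split axis. The following is a pure-Python sketch of that check, assuming a flat sequence; `split_by_sections` is a hypothetical helper written for illustration and is not Paddle's implementation.

```python
def split_by_sections(values, sections):
    """Split a flat sequence into consecutive chunks of the given sizes.

    Raises ValueError when the sizes do not sum to len(values), mirroring
    the invariant reported in the traceback.
    """
    total = sum(sections)
    if total != len(values):
        raise ValueError(
            f"sum of sections ({total}) != input length ({len(values)})"
        )
    chunks, start = [], 0
    for size in sections:
        chunks.append(values[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    # Sizes from the report: 80 + 52 = 132 elements are expected.
    ok = split_by_sections(list(range(132)), [80, 52])
    print([len(chunk) for chunk in ok])

    # An input of the length seen in the error fails the same check.
    try:
        split_by_sections(range(1638400), [80, 52])
    except ValueError as exc:
        print("rejected:", exc)
```

In the reported run the section sizes look plausible for a batch of two images, so the unexpected part is the length of the index tensor passed to the split, not the sizes themselves.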
Environment
Bug description confirmation
Are you willing to submit a PR?