yeyupiaoling / PP-YOLOE

PP-YOLOE object detection model implemented in PaddlePaddle
Apache License 2.0
110 stars 32 forks

Problem during training #15

Closed xzdong-2019 closed 1 year ago

xzdong-2019 commented 1 year ago

[2023-06-08 07:37:55.523808 INFO ] trainer:train:310 - Test epoch: 99, time/epoch: 0:04:26.419727, best_mAP: 0.83875, mAP: 0.81693
[2023-06-08 07:37:55.524025 INFO ] trainer:train:312 - ======================================================================
[2023-06-08 07:37:56.038528 INFO ] trainer:save_checkpoint:196 - 已保存模型:models/PPYOLOE_M/epoch_99
Traceback (most recent call last):
  File "train.py", line 44, in <module>
    trainer.train(num_epoch=args.num_epoch,
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/trainer.py", line 300, in train
    self.__train_epoch(max_epoch=num_epoch, epoch_id=epoch_id, log_interval=log_interval, local_rank=local_rank, writer=writer)
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/trainer.py", line 204, in __train_epoch
    output = self.model(data)
  File "/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1012, in __call__
    return self.forward(*inputs, **kwargs)
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/meta_arch.py", line 53, in forward
    out = self.get_loss()
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/yolo.py", line 46, in get_loss
    return self._forward()
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/yolo.py", line 36, in _forward
    yolo_losses = self.yolo_head(neck_feats, self.inputs)
  File "/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1012, in __call__
    return self.forward(*inputs, **kwargs)
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/ppyoloe_head.py", line 202, in forward
    return self.forward_train(feats, targets)
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/ppyoloe_head.py", line 142, in forward_train
    return self.get_loss([
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/ppyoloe_head.py", line 308, in get_loss
    self.assigner(
  File "/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1012, in __call__
    return self.forward(*inputs, **kwargs)
  File "/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/base.py", line 375, in _decorate_function
    return func(*args, **kwargs)
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/task_aligned_assigner.py", line 93, in forward
    ious = iou_similarity(gt_bboxes, pred_bboxes)
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/bbox_utils.py", line 42, in iou_similarity
    x2y2 = paddle.minimum(px2y2, gx2y2)
  File "/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/tensor/math.py", line 1008, in minimum
    return _C_ops.minimum(x, y)
ValueError: (InvalidArgument) The 3-th dimension of input tensor is expected to be equal with the 3-th dimension of output tensor 2 or 1, but received 0. (at /paddle/paddle/phi/kernels/funcs/broadcast_function.h:77)

yeyupiaoling commented 1 year ago

Did this happen during training? Did training run normally before that?

xzdong-2019 commented 1 year ago

It happened during training, after 100 epochs had already completed.

yeyupiaoling commented 1 year ago

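The error looks like a dimension problem, but if training ran normally before this point, the data, the model, and everything else should be fine. Did the loss value change in any way?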
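For reference, the `ValueError` in the traceback comes from a broadcasting check: two shapes are compatible only if, axis by axis, the sizes are equal or one of them is 1. The sketch below reimplements that NumPy-style rule in plain Python to show why a size of 0 against a size of 2 is rejected. The function name `broadcast_shape` and the example shapes are illustrative assumptions, not taken from the PP-YOLOE code, and Paddle's own handling of zero-size dimensions may differ in detail.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast shape of two shapes, or raise ValueError.

    Hypothetical helper that follows the NumPy-style rule: sizes on an
    axis must be equal, or one of them must be 1.
    """
    # Left-pad the shorter shape with 1s so both have the same rank.
    ndim = max(len(shape_a), len(shape_b))
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for axis, (size_a, size_b) in enumerate(zip(a, b)):
        if size_a == size_b or size_b == 1:
            result.append(size_a)
        elif size_a == 1:
            result.append(size_b)
        else:
            raise ValueError(
                f"axis {axis}: sizes {size_a} and {size_b} cannot be broadcast"
            )
    return tuple(result)


# Illustrative shapes only: 3 ground-truth boxes against 8400 predictions.
print(broadcast_shape((4, 3, 1, 2), (4, 1, 8400, 2)))  # (4, 3, 8400, 2)

# A last axis of size 0 against size 2 is rejected, as in the error above.
try:
    broadcast_shape((4, 3, 1, 0), (4, 1, 8400, 2))
except ValueError as err:
    print("incompatible:", err)
```

Printing the shapes of `gt_bboxes` and `pred_bboxes` just before the failing `iou_similarity` call would show which of the two tensors carries the zero-size axis.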

xzdong-2019 commented 1 year ago

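Yes, the error is a dimension problem, and the loss did go down. Since training was able to run, I assumed the data was fine, so I can't tell what went wrong. Training terminates every time it reaches epoch 100.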
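One way to test the assumption that the data is fine is to scan the annotation file for images that have no usable box. The sketch below is a hypothetical helper (`find_suspect_images` is not part of the repository). It assumes the annotations use the standard COCO layout, with `images`, `annotations`, and each `bbox` given as `[x, y, width, height]`. The thread does not confirm that empty or degenerate boxes cause this error, so treat it as one check among several.

```python
from collections import defaultdict


def find_suspect_images(coco, min_size=1e-6):
    """Scan a COCO-style dict for images with no valid bounding box.

    Returns (image_ids_without_valid_boxes, degenerate_annotation_ids).
    A box is treated as degenerate when it does not have four values or
    when its width or height is not larger than min_size.
    """
    valid_boxes_per_image = defaultdict(int)
    degenerate = []

    for ann in coco.get("annotations", []):
        bbox = ann.get("bbox", [])
        if len(bbox) != 4 or bbox[2] <= min_size or bbox[3] <= min_size:
            degenerate.append(ann.get("id"))
            continue
        valid_boxes_per_image[ann["image_id"]] += 1

    empty = [
        img["id"]
        for img in coco.get("images", [])
        if valid_boxes_per_image[img["id"]] == 0
    ]
    return empty, degenerate


# Small in-memory example: image 2 has only a zero-width box, image 3 has none.
sample = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "annotations": [
        {"id": 10, "image_id": 1, "bbox": [5, 5, 20, 30]},
        {"id": 11, "image_id": 2, "bbox": [5, 5, 0, 30]},
    ],
}
print(find_suspect_images(sample))  # ([2, 3], [11])
```

To run it on the real data, load the file with `json.load` (for example `dataset/train.json` from the configuration) and pass the resulting dict to the function. A clean result does not rule out the data pipeline entirely, because augmentations applied at load time could in principle still produce a sample without boxes.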

xzdong-2019 commented 1 year ago

When I start it again, the same problem still occurs:

declare_namespace(pkg)
----------- Configuration Arguments -----------
batch_size: 4
eval_anno_path: dataset/eval.json
image_dir: dataset/
learning_rate: 0.000125
log_interval: 500
model_type: M
num_classes: 4
num_epoch: 1000
num_workers: 8
pretrained_model: None
resume_model: None
save_model_path: models/
train_anno_path: dataset/train.json
use_gpu: True
use_random_crop: True
use_random_distort: True
use_random_expand: True
use_random_flip: True

True
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
W0609 16:01:00.625028 6134 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 6.1, Driver API Version: 11.1, Runtime API Version: 11.1
W0609 16:01:00.627727 6134 gpu_resources.cc:91] device: 0, cuDNN Version: 8.1.
W0609 16:01:00.627743 6134 gpu_resources.cc:117] WARNING: device: 0. The installed Paddle is compiled with CUDA 11.2, but CUDA runtime version in your machine is 11.1, which may cause serious incompatible bug. Please recompile or reinstall Paddle with compatible CUDA version.
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
[2023-06-09 16:01:02.739166 INFO ] trainer:train:285 - 训练数据:720
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for backbone.stages.0.blocks.0.conv2.alpha. backbone.stages.0.blocks.0.conv2.alpha is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for backbone.stages.0.blocks.1.conv2.alpha. backbone.stages.0.blocks.1.conv2.alpha is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for backbone.stages.1.blocks.0.conv2.alpha. backbone.stages.1.blocks.0.conv2.alpha is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for backbone.stages.1.blocks.1.conv2.alpha. backbone.stages.1.blocks.1.conv2.alpha is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for backbone.stages.1.blocks.2.conv2.alpha. backbone.stages.1.blocks.2.conv2.alpha is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for backbone.stages.1.blocks.3.conv2.alpha. backbone.stages.1.blocks.3.conv2.alpha is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for backbone.stages.2.blocks.0.conv2.alpha. backbone.stages.2.blocks.0.conv2.alpha is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for backbone.stages.2.blocks.1.conv2.alpha. backbone.stages.2.blocks.1.conv2.alpha is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for backbone.stages.2.blocks.2.conv2.alpha. backbone.stages.2.blocks.2.conv2.alpha is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for backbone.stages.2.blocks.3.conv2.alpha. backbone.stages.2.blocks.3.conv2.alpha is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for backbone.stages.3.blocks.0.conv2.alpha. backbone.stages.3.blocks.0.conv2.alpha is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for backbone.stages.3.blocks.1.conv2.alpha. backbone.stages.3.blocks.1.conv2.alpha is not found in the provided dict.
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for yolo_head.pred_cls.0.weight. yolo_head.pred_cls.0.weight receives a shape [365, 576, 3, 3], but the expected shape is [4, 576, 3, 3].
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for yolo_head.pred_cls.0.bias. yolo_head.pred_cls.0.bias receives a shape [365], but the expected shape is [4].
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for yolo_head.pred_cls.1.weight. yolo_head.pred_cls.1.weight receives a shape [365, 288, 3, 3], but the expected shape is [4, 288, 3, 3].
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for yolo_head.pred_cls.1.bias. yolo_head.pred_cls.1.bias receives a shape [365], but the expected shape is [4].
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for yolo_head.pred_cls.2.weight. yolo_head.pred_cls.2.weight receives a shape [365, 144, 3, 3], but the expected shape is [4, 144, 3, 3].
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py:1652: UserWarning: Skip loading for yolo_head.pred_cls.2.bias. yolo_head.pred_cls.2.bias receives a shape [365], but the expected shape is [4].
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
[2023-06-09 16:01:02.990542 INFO ] trainer:load_pretrained:151 - 成功加载预训练模型:pretrained_models/ppyoloe_crn_m_obj365_pretrained.pdparams
[2023-06-09 16:01:03.230492 INFO ] trainer:load_checkpoint:169 - 成功恢复模型参数和优化方法参数:models/PPYOLOE_M/last_model
W0609 16:01:06.713155 6134 gpu_resources.cc:217] WARNING: device: . The installed Paddle is compiled with CUDNN 8.2, but CUDNN version in your machine is 8.1, which may cause serious incompatible bug. Please recompile or reinstall Paddle with compatible CUDNN version.
/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/nn/layer/norm.py:712: UserWarning: When training, we now always track global mean and variance.
  warnings.warn(
Traceback (most recent call last):
  File "train.py", line 44, in <module>
    trainer.train(num_epoch=args.num_epoch,
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/trainer.py", line 300, in train
    self.__train_epoch(max_epoch=num_epoch, epoch_id=epoch_id, log_interval=log_interval, local_rank=local_rank, writer=writer)
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/trainer.py", line 204, in __train_epoch
    output = self.model(data)
  File "/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1012, in __call__
    return self.forward(*inputs, **kwargs)
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/meta_arch.py", line 53, in forward
    out = self.get_loss()
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/yolo.py", line 46, in get_loss
    return self._forward()
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/yolo.py", line 36, in _forward
    yolo_losses = self.yolo_head(neck_feats, self.inputs)
  File "/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1012, in __call__
    return self.forward(*inputs, **kwargs)
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/ppyoloe_head.py", line 202, in forward
    return self.forward_train(feats, targets)
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/ppyoloe_head.py", line 142, in forward_train
    return self.get_loss([
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/ppyoloe_head.py", line 308, in get_loss
    self.assigner(
  File "/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1012, in __call__
    return self.forward(*inputs, **kwargs)
  File "/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/fluid/dygraph/base.py", line 375, in _decorate_function
    return func(*args, **kwargs)
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/task_aligned_assigner.py", line 93, in forward
    ious = iou_similarity(gt_bboxes, pred_bboxes)
  File "/data/dongxz/competion/cv/PP-YOLOE/ppyoloe/model/bbox_utils.py", line 42, in iou_similarity
    x2y2 = paddle.minimum(px2y2, gx2y2)
  File "/data/anaconda3/envs/dongxz_paddlepaddle/lib/python3.8/site-packages/paddle/tensor/math.py", line 1008, in minimum
    return _C_ops.minimum(x, y)
ValueError: (InvalidArgument) The 3-th dimension of input tensor is expected to be equal with the 3-th dimension of output tensor 2 or 1, but received 0. (at /paddle/paddle/phi/kernels/funcs/broadcast_function.h:77)

yeyupiaoling commented 1 year ago

Check whether anything special happens at epoch 100. I don't remember.