PaddlePaddle / PaddleDetection

Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
Apache License 2.0

ppyolo_tiny_640e error #4180

Closed yyyssq closed 2 years ago

yyyssq commented 3 years ago

The PaddleDetection team appreciates any suggestions or problems you report.

Checklist:

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.

Describe the bug

With ppyolo_tiny_640e, an error occurs in the validation step that runs after the first round of training finishes.

Reproduction

ERROR:root:DataLoader reader thread raised an exception!
Exception in thread Thread-4:
Traceback (most recent call last):
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 391, in _thread_loop
    batch = self._get_data()
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 505, in _get_data
    batch.reraise()
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/worker.py", line 168, in reraise
    raise self.exc_type(msg)
ValueError: DataLoader worker(2) caught ValueError with message:
Traceback (most recent call last):
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/worker.py", line 320, in _worker_loop
    batch = fetcher.fetch(indices)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/fetcher.py", line 117, in fetch
    data = self.collate_fn(data)
  File "/home/aistudio/PaddleDetection/ppdet/data/reader.py", line 91, in __call__
    batch_data = default_collate_fn(data)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/collate.py", line 70, in default_collate_fn
    for key in sample
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/collate.py", line 70, in <dictcomp>
    for key in sample
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/collate.py", line 58, in default_collate_fn
    batch = np.stack(batch, axis=0)
  File "<__array_function__ internals>", line 6, in stack
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/numpy/core/shape_base.py", line 427, in stack
    raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape

[09/14 22:01:33] ppdet.engine INFO: Eval iter: 0
Traceback (most recent call last):
  File "./tools/train.py", line 140, in <module>
    main()
  File "./tools/train.py", line 136, in main
    run(FLAGS, cfg)
  File "./tools/train.py", line 109, in run
    trainer.train(FLAGS.eval)
  File "/home/aistudio/PaddleDetection/ppdet/engine/trainer.py", line 415, in train
    self._eval_with_loader(self._eval_loader)
  File "/home/aistudio/PaddleDetection/ppdet/engine/trainer.py", line 429, in _eval_with_loader
    for step_id, data in enumerate(loader):
  File "/home/aistudio/PaddleDetection/ppdet/data/reader.py", line 209, in __next__
    return next(self.loader)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 565, in __next__
    data = self._reader.read_next_var_list()
SystemError: (Fatal) Blocking queue is killed because the data reader raises an exception.
  [Hint: Expected killed_ != true, but received killed_:1 == true:1.] (at /paddle/paddle/fluid/operators/reader/blocking_queue.h:166)
  1. What command or script did you run?
python ./tools/train.py -c ./configs/ppyolo/ppyolo_tiny_650e_coco.yml --eval --use_vdl USE_VDL --vdl_log_dir ./log/pppyolo
  2. Did you make any modifications to the code or config? Did you understand what you modified? Please provide the code that you modified.

I changed the dataset reading from COCO format to VOC format; the other changes are hyperparameter adjustments.

  3. What dataset did you use?

A custom dataset.

  4. Please provide the error messages or relevant log information.
[09/14 21:50:01] ppdet.utils.checkpoint INFO: Save checkpoint: output/ppyolo_tiny_650e_coco
ERROR:root:DataLoader reader thread raised an exception!
Exception in thread Thread-4:
Traceback (most recent call last):
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 391, in _thread_loop
    batch = self._get_data()
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 505, in _get_data
    batch.reraise()
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/worker.py", line 168, in reraise
    raise self.exc_type(msg)
ValueError: DataLoader worker(2) caught ValueError with message:
Traceback (most recent call last):
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/worker.py", line 320, in _worker_loop
    batch = fetcher.fetch(indices)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/fetcher.py", line 117, in fetch
    data = self.collate_fn(data)
  File "/home/aistudio/PaddleDetection/ppdet/data/reader.py", line 91, in __call__
    batch_data = default_collate_fn(data)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/collate.py", line 70, in default_collate_fn
    for key in sample
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/collate.py", line 70, in <dictcomp>
    for key in sample
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/collate.py", line 58, in default_collate_fn
    batch = np.stack(batch, axis=0)
  File "<__array_function__ internals>", line 6, in stack
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/numpy/core/shape_base.py", line 427, in stack
    raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape

[09/14 21:50:02] ppdet.engine INFO: Eval iter: 0
Traceback (most recent call last):
  File "./tools/train.py", line 140, in <module>
    main()
  File "./tools/train.py", line 136, in main
    run(FLAGS, cfg)
  File "./tools/train.py", line 109, in run
    trainer.train(FLAGS.eval)
  File "/home/aistudio/PaddleDetection/ppdet/engine/trainer.py", line 415, in train
    self._eval_with_loader(self._eval_loader)
  File "/home/aistudio/PaddleDetection/ppdet/engine/trainer.py", line 429, in _eval_with_loader
    for step_id, data in enumerate(loader):
  File "/home/aistudio/PaddleDetection/ppdet/data/reader.py", line 209, in __next__
    return next(self.loader)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 565, in __next__
    data = self._reader.read_next_var_list()
SystemError: (Fatal) Blocking queue is killed because the data reader raises an exception.
  [Hint: Expected killed_ != true, but received killed_:1 == true:1.] (at /paddle/paddle/fluid/operators/reader/blocking_queue.h:166)
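
Editor's note: the innermost frame in the log above is `np.stack(batch, axis=0)` inside `default_collate_fn`, which fails when the same field has a different shape in different samples of one batch. The sketch below is not PaddleDetection code. It uses hypothetical sample dictionaries (the `image` and `gt_bbox` keys and all shapes are assumptions for illustration) to show how to find which field is mismatched and to reproduce the same `ValueError`.

```python
import numpy as np


def find_mismatched_keys(samples):
    """Return {key: [shape, ...]} for every field whose shape differs across samples."""
    mismatched = {}
    for key in samples[0]:
        shapes = [np.asarray(sample[key]).shape for sample in samples]
        if len(set(shapes)) > 1:
            mismatched[key] = shapes
    return mismatched


# Hypothetical batch: identical image shapes, but a different number of boxes per image.
samples = [
    {"image": np.zeros((3, 320, 320), np.float32), "gt_bbox": np.zeros((2, 4), np.float32)},
    {"image": np.zeros((3, 320, 320), np.float32), "gt_bbox": np.zeros((5, 4), np.float32)},
]

mismatched = find_mismatched_keys(samples)
print(mismatched)

# Stacking the mismatched field raises the ValueError seen in the traceback.
try:
    np.stack([sample["gt_bbox"] for sample in samples], axis=0)
    stack_error = None
except ValueError as exc:
    stack_error = str(exc)
print(stack_error)
```

Running a check like this on a few evaluation samples can narrow down whether a per-image field with a variable length (ground-truth boxes, for example) is what breaks batching.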

Environment

  1. Please provide the version of Paddle and PaddleDetection you use:

2.1.2 develop

  2. Please provide the version of any other related tools/products used, such as PaddleServing or PaddleInference:

  3. Please provide the OS information, e.g., Linux:

Linux

  4. Please provide the version of Python you used.

3.7

  5. Please provide the version of CUDA/cuDNN you used.

If your issue looks like an installation or environment issue, please first try to solve it yourself with the instructions in https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/INSTALL.md

qingqing01 commented 3 years ago

For a VOC configuration, we recommend using:

https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.2/configs/ppyolo/ppyolov2_r50vd_dcn_voc.yml

It has one setting that differs from the COCO configuration:

https://github.com/PaddlePaddle/PaddleDetection/blob/40a8249c9ad143962a4bf9dbead510e1cb56d1a3/configs/ppyolo/ppyolov2_r50vd_dcn_voc.yml#L16-L20
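
Editor's note: the linked config lines are not quoted in this thread. If they turn off batch collation for the evaluation reader (an assumption here), the effect can be sketched as follows: instead of stacking every field into one array, each field is kept as a list with one entry per sample, so variable-length ground-truth fields no longer need matching shapes. The function names, keys, and shapes below are hypothetical and are not PaddleDetection's implementation.

```python
import numpy as np


def collate_stacked(samples):
    """Stack every field into one array; requires identical shapes per field."""
    return {key: np.stack([s[key] for s in samples], axis=0) for key in samples[0]}


def collate_as_lists(samples):
    """Keep every field as a per-sample list; shapes may differ between samples."""
    return {key: [s[key] for s in samples] for key in samples[0]}


# Hypothetical evaluation samples with different numbers of ground-truth boxes.
samples = [
    {"image": np.zeros((3, 320, 320), np.float32), "gt_bbox": np.zeros((1, 4), np.float32)},
    {"image": np.zeros((3, 320, 320), np.float32), "gt_bbox": np.zeros((3, 4), np.float32)},
]

# Stacking fails on gt_bbox because (1, 4) and (3, 4) cannot form one array.
try:
    collate_stacked(samples)
    stacked_ok = True
except ValueError:
    stacked_ok = False

# List-based collation succeeds and preserves each sample's own shape.
batch = collate_as_lists(samples)
box_shapes = [boxes.shape for boxes in batch["gt_bbox"]]
print(stacked_ok, box_shapes)
```

The trade-off is that downstream code has to handle lists rather than a single batched array for those fields.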

paddle-bot-old[bot] commented 2 years ago

Since this issue has not been updated for more than three months, it will be closed. If it is not solved or there is a follow-up question, please reopen it at any time and we will continue to follow up. It is recommended to pull and try the latest code first.