PaddlePaddle / PaddleSeg

Easy-to-use image segmentation library with awesome pre-trained model zoo, supporting a wide range of practical tasks in Semantic Segmentation, Interactive Segmentation, Panoptic Segmentation, Image Matting, 3D Segmentation, etc.
https://arxiv.org/abs/2101.06175
Apache License 2.0

Error on Ubuntu: label expected >= 0 and < 2, or == 255, but got 89, although the model config and the dataset are both fine. #3770

Closed lingdujunshang closed 2 months ago

lingdujunshang commented 3 months ago

Search before asking

Describe the Bug

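I'm a long-time Paddle user. I followed the 'Quick Start' section of the tutorial directly: the model config file is the default one, the data is the public optic_disc_seg dataset, and the settings in configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml are exactly as given in the tutorial.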

First attempt, on Ubuntu. The error output is as follows:

(paddle_seg) xuqing@dell-PowerEdge-R740:~/projects/paddle_seg/PaddleSeg$ python tools/train.py --config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml --save_interval 500 --do_eval --use_vdl --save_dir output
2024-08-02 10:12:37 [WARNING] Add the num_classes in train_dataset and val_dataset config to model config. We suggest you manually set num_classes in model config.
2024-08-02 10:12:38 [INFO]
------------Environment Information-------------
platform: Linux-6.5.0-35-generic-x86_64-with-glibc2.35
Python: 3.9.19 (main, Apr 6 2024, 17:57:55) [GCC 11.4.0]
Paddle compiled with cuda: True
NVCC: Build cuda_11.8.r11.8/compiler.31833905_0
cudnn: 8.6
GPUs used: 1
CUDA_VISIBLE_DEVICES: None
GPU: ['GPU 0: NVIDIA GeForce', 'GPU 1: NVIDIA GeForce']
GCC: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
PaddleSeg: 0.0.0.dev0
PaddlePaddle: 2.6.1
OpenCV: 4.10.0

2024-08-02 10:12:38 [INFO]
---------------Config Information---------------
batch_size: 4
iters: 1000
train_dataset:
  dataset_root: data/optic_disc_seg
  mode: train
  num_classes: 2
  train_path: data/optic_disc_seg/train_list.txt
  transforms:

2024-08-02 10:12:38 [INFO] Set device: gpu
2024-08-02 10:12:38 [INFO] Use the following config to build model
model:
  backbone:
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
    type: STDC2
  num_classes: 2
  type: PPLiteSeg
W0802 10:12:38.203576 221128 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.9, Driver API Version: 12.2, Runtime API Version: 11.8
W0802 10:12:38.203653 221128 gpu_resources.cc:164] device: 0, cuDNN Version: 8.6.
2024-08-02 10:12:38 [INFO] Loading pretrained model from https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
2024-08-02 10:12:38 [INFO] There are 265/265 variables loaded into STDCNet.
2024-08-02 10:12:38 [INFO] Use the following config to build train_dataset
train_dataset:
  dataset_root: data/optic_disc_seg
  mode: train
  num_classes: 2
  train_path: data/optic_disc_seg/train_list.txt
  transforms:

Fine. Besides Ubuntu, my local machine is also available, so I ran the same command directly on Windows, and to my surprise it runs without any problem. The output is as follows:

(paddleseg) D:\env\paddleseg\PaddleSeg>python tools/train.py --config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml --save_interval 500 --do_eval --use_vdl --save_dir output
2024-08-02 10:33:12 [WARNING] Add the num_classes in train_dataset and val_dataset config to model config. We suggest you manually set num_classes in model config.
2024-08-02 10:33:12 [INFO]
------------Environment Information-------------
platform: Windows-10-10.0.19041-SP0
Python: 3.9.2 (tags/v3.9.2:1a79785, Feb 19 2021, 13:44:55) [MSC v.1928 64 bit (AMD64)]
Paddle compiled with cuda: True
NVCC: Build cuda_11.7.r11.7/compiler.31294372_0
cudnn: 8.4
GPUs used: 1
CUDA_VISIBLE_DEVICES: None
GPU: ['GPU 0: NVIDIA GeForce']
GCC: gcc (MinGW-W64 x86_64-posix-seh, built by Brecht Sanders) 11.3.0
PaddleSeg: 2.9.0
PaddlePaddle: 2.5.2
OpenCV: 4.8.1

2024-08-02 10:33:12 [INFO]
---------------Config Information---------------
batch_size: 4
iters: 1000
train_dataset:
  dataset_root: data/optic_disc_seg
  mode: train
  num_classes: 2
  train_path: data/optic_disc_seg/train_list.txt
  transforms:

2024-08-02 10:33:12 [INFO] Set device: gpu
2024-08-02 10:33:12 [INFO] Use the following config to build model
model:
  backbone:
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
    type: STDC2
  num_classes: 2
  type: PPLiteSeg
W0802 10:33:12.746259 17620 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.7, Runtime API Version: 11.7
W0802 10:33:12.746259 17620 gpu_resources.cc:149] device: 0, cuDNN Version: 8.4.
2024-08-02 10:33:13 [INFO] Loading pretrained model from https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
Connecting to https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
Downloading PP_STDCNet2.tar.gz
[==================================================] 100.00%
Uncompress PP_STDCNet2.tar.gz
[==================================================] 100.00%
2024-08-02 10:33:15 [INFO] There are 265/265 variables loaded into STDCNet.
2024-08-02 10:33:15 [INFO] Use the following config to build train_dataset
train_dataset:
  dataset_root: data/optic_disc_seg
  mode: train
  num_classes: 2
  train_path: data/optic_disc_seg/train_list.txt
  transforms:

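I also tried other custom datasets in several ways (not COCO-format data yet; just ordinary image-segmentation data with PNG annotations and JPG images) and got the same result. In summary, I have two suspicions:

1. Older versions of PaddleSeg run the pp_liteseg model without problems, while the new version (the one I git-cloned on August 1) has a problem reading the image data in Annotations: either a grayscale image is read as a three-channel image, or some image package returns different results on Windows and on Ubuntu.
2. PaddleSeg on Windows differs from PaddleSeg on Ubuntu. My tentative guess is that something is wrong in the dataset-reading code, so that on Ubuntu the configured labels cannot be matched to the annotation values in the PNG files.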
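One way to test the first suspicion is to count the pixel values actually stored in the annotation PNGs and flag anything that is neither a class id in `[0, num_classes)` nor the ignore value 255 named in the error message. The following is a minimal sketch, not PaddleSeg code: the helper names are hypothetical, Pillow is assumed to be installed, and scanning every `*.png` under the dataset root assumes that only annotations are stored as PNG there.

```python
from pathlib import Path


def invalid_from_histogram(hist, num_classes, ignore_index=255):
    """Return {value: pixel_count} for values that are neither a class id nor ignore_index."""
    return {
        value: count
        for value, count in enumerate(hist)
        if count and not (0 <= value < num_classes or value == ignore_index)
    }


def scan_labels(dataset_root, num_classes, ignore_index=255):
    """Report label files whose mode or pixel values look wrong (hypothetical helper)."""
    from PIL import Image  # imported lazily; Pillow is an assumption of this sketch

    report = {}
    for path in sorted(Path(dataset_root).rglob("*.png")):
        with Image.open(path) as img:
            if img.mode not in ("L", "P"):
                # e.g. "RGB": the file stores colours rather than one class id per pixel
                report[str(path)] = {"mode": img.mode, "invalid": None}
                continue
            # For "L" and "P" images, histogram() has 256 bins indexed by
            # gray value or palette index respectively.
            invalid = invalid_from_histogram(img.histogram(), num_classes, ignore_index)
            if invalid:
                report[str(path)] = {"mode": img.mode, "invalid": invalid}
    return report
```

Running `scan_labels("data/optic_disc_seg", num_classes=2)` on both machines and comparing the reports would show whether the files on disk differ. An empty report on Ubuntu would suggest that the files are fine and that the difference lies in how they are decoded at training time.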

Environment

platform: Linux-6.5.0-35-generic-x86_64-with-glibc2.35
Python: 3.9.19 (main, Apr 6 2024, 17:57:55) [GCC 11.4.0]
Paddle compiled with cuda: True
NVCC: Build cuda_11.8.r11.8/compiler.31833905_0
cudnn: 8.6
GPUs used: 1
CUDA_VISIBLE_DEVICES: None
GPU: ['GPU 0: NVIDIA GeForce', 'GPU 1: NVIDIA GeForce']
GCC: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
PaddleSeg: 0.0.0.dev0
PaddlePaddle: 2.6.1
OpenCV: 4.10.0

Note: Ubuntu 22.04. PaddleSeg was installed following the tutorial and passes the run check (sh tests/install/check_predict.sh).
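The two environment dumps in this report differ in more than the operating system: PaddleSeg 0.0.0.dev0 versus 2.9.0, PaddlePaddle 2.6.1 versus 2.5.2, and OpenCV 4.10.0 versus 4.8.1. To narrow down which difference matters, it may help to record the versions of the image-related packages on both machines as well. The sketch below uses only the standard library; the distribution names in `PACKAGES` are assumptions and may need adjusting to what is actually installed.

```python
from importlib import metadata

# Assumed distribution names; the GPU build of Paddle is typically
# published under a separate name from the CPU build.
PACKAGES = (
    "paddlepaddle",
    "paddlepaddle-gpu",
    "paddleseg",
    "opencv-python",
    "numpy",
    "Pillow",
)


def installed_versions(packages=PACKAGES):
    """Return {distribution_name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

Printing `installed_versions()` on Ubuntu and on Windows gives two dictionaries that can be compared key by key.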

Bug description confirmation

Are you willing to submit a PR?

liuhongen1234567 commented 3 months ago

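Hello, on my side, training on Ubuntu with Paddle 2.6 did not hit the problem described above. You could re-download the data following docs/quick_start.md and try again. This error looks like a dataset problem.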
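Before re-downloading, a quick consistency check of the existing copy may show whether it is incomplete. The sketch below assumes that each line of `train_list.txt` holds an image path and a label path separated by a single space, both relative to `dataset_root`; that layout and the function name `check_file_list` are assumptions of this example, not something stated in this thread.

```python
from pathlib import Path


def check_file_list(dataset_root, list_path, separator=" "):
    """Return a list of (line_number, problem, detail) tuples for a file list."""
    root = Path(dataset_root)
    problems = []
    with open(list_path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            parts = line.split(separator)
            if len(parts) != 2:
                problems.append((lineno, "expected 2 fields", line))
                continue
            for rel in parts:
                # Each entry should resolve to an existing file under the root.
                if not (root / rel).is_file():
                    problems.append((lineno, "missing file", rel))
    return problems
```

For the configuration shown in the logs, the call would be `check_file_list("data/optic_disc_seg", "data/optic_disc_seg/train_list.txt")`. An empty list means every listed file exists; it says nothing about the pixel values inside the label files.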

TingquanGao commented 3 months ago

The issue has had no response for a long time and will be closed. You can reopen it or open a new issue if you are still confused.


From Bot

TingquanGao commented 2 months ago

The issue has had no response for a long time and will be closed. You can reopen it or open a new issue if you are still confused.


From Bot