Closed lingdujunshang closed 2 months ago
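Hello, I trained on Ubuntu with Paddle 2.6 and did not run into the problem described above. You could re-download the data following docs/quick_start.md and try again. This error looks like a dataset problem.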
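Before re-downloading, the dataset hypothesis can be narrowed down by checking that every file named in the training list actually exists. The following is a minimal sketch, not PaddleSeg code: the function names are hypothetical, and it assumes each line of train_list.txt holds an image path and a label path separated by a single space, both relative to dataset_root.

```python
from pathlib import Path


def parse_file_list(text, dataset_root="data/optic_disc_seg", separator=" "):
    """Turn the text of a list file into (image_path, label_path) pairs.

    Assumption: one "image label" pair per line, relative to dataset_root.
    """
    pairs = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(separator)
        if len(parts) != 2:
            raise ValueError(f"line {number}: expected 2 fields, got {len(parts)}")
        image, label = parts
        pairs.append((Path(dataset_root) / image, Path(dataset_root) / label))
    return pairs


def missing_files(pairs):
    """Return every path from the pairs that does not exist on disk."""
    return [path for pair in pairs for path in pair if not path.exists()]


if __name__ == "__main__":
    list_path = Path("data/optic_disc_seg/train_list.txt")
    if list_path.exists():
        pairs = parse_file_list(list_path.read_text())
        print(len(pairs), "pairs,", len(missing_files(pairs)), "missing files")
```

If nothing is reported missing, the next thing to look at is the pixel values stored in the label images themselves.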
The issue has no response for a long time and will be closed. You can reopen or new another issue if are still confused.
From Bot
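Search before asking
Describe the Bug
I'm a long-time Paddle user. I followed the "Quick Start" section of the tutorial directly: the model config file is the default one, the data is the unmodified public dataset optic_disc_seg, and the settings in configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml are exactly as in the tutorial.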
First attempt, on Ubuntu. The output with the error is as follows:
(paddle_seg) xuqing@dell-PowerEdge-R740:~/projects/paddle_seg/PaddleSeg$ python tools/train.py --config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml --save_interval 500 --do_eval --use_vdl --save_dir output
2024-08-02 10:12:37 [WARNING] Add the `num_classes` in train_dataset and val_dataset config to model config. We suggest you manually set `num_classes` in model config.
2024-08-02 10:12:38 [INFO]
------------Environment Information-------------
platform: Linux-6.5.0-35-generic-x86_64-with-glibc2.35
Python: 3.9.19 (main, Apr 6 2024, 17:57:55) [GCC 11.4.0]
Paddle compiled with cuda: True
NVCC: Build cuda_11.8.r11.8/compiler.31833905_0
cudnn: 8.6
GPUs used: 1
CUDA_VISIBLE_DEVICES: None
GPU: ['GPU 0: NVIDIA GeForce', 'GPU 1: NVIDIA GeForce']
GCC: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
PaddleSeg: 0.0.0.dev0
PaddlePaddle: 2.6.1
OpenCV: 4.10.0
2024-08-02 10:12:38 [INFO]
---------------Config Information---------------
batch_size: 4
iters: 1000
train_dataset:
  dataset_root: data/optic_disc_seg
  mode: train
  num_classes: 2
  train_path: data/optic_disc_seg/train_list.txt
  transforms:
type: CrossEntropyLoss
model:
  backbone:
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
    type: STDC2
  num_classes: 2
  type: PPLiteSeg
2024-08-02 10:12:38 [INFO] Set device: gpu
2024-08-02 10:12:38 [INFO] Use the following config to build model
model:
  backbone:
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
    type: STDC2
  num_classes: 2
  type: PPLiteSeg
W0802 10:12:38.203576 221128 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.9, Driver API Version: 12.2, Runtime API Version: 11.8
W0802 10:12:38.203653 221128 gpu_resources.cc:164] device: 0, cuDNN Version: 8.6.
2024-08-02 10:12:38 [INFO] Loading pretrained model from https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
2024-08-02 10:12:38 [INFO] There are 265/265 variables loaded into STDCNet.
2024-08-02 10:12:38 [INFO] Use the following config to build train_dataset
train_dataset:
  dataset_root: data/optic_disc_seg
  mode: train
  num_classes: 2
  train_path: data/optic_disc_seg/train_list.txt
  transforms:
Error: /paddle/paddle/phi/kernels/gpu/cross_entropy_kernel.cu:998 Assertion `false` failed. The value of label expected >= 0 and < 2, or == 255, but got 218. Please check label value.
Error: /paddle/paddle/phi/kernels/gpu/cross_entropy_kernel.cu:998 Assertion `false` failed. The value of label expected >= 0 and < 2, or == 255, but got 89. Please check label value.
(The same assertion repeats many more times in the log, always reporting 218 or 89.)
Traceback (most recent call last): File "/home/xuqing/projects/paddle_seg/PaddleSeg/tools/train.py", line 219, in
Fine. Besides the Ubuntu machine I also have a local computer, so I ran the same command directly on Windows and, to my surprise, it ran without any problem. The output is as follows:
(paddleseg) D:\env\paddleseg\PaddleSeg>python tools/train.py --config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml --save_interval 500 --do_eval --use_vdl --save_dir output
2024-08-02 10:33:12 [WARNING] Add the `num_classes` in train_dataset and val_dataset config to model config. We suggest you manually set `num_classes` in model config.
2024-08-02 10:33:12 [INFO]
------------Environment Information-------------
platform: Windows-10-10.0.19041-SP0
Python: 3.9.2 (tags/v3.9.2:1a79785, Feb 19 2021, 13:44:55) [MSC v.1928 64 bit (AMD64)]
Paddle compiled with cuda: True
NVCC: Build cuda_11.7.r11.7/compiler.31294372_0
cudnn: 8.4
GPUs used: 1
CUDA_VISIBLE_DEVICES: None
GPU: ['GPU 0: NVIDIA GeForce']
GCC: gcc (MinGW-W64 x86_64-posix-seh, built by Brecht Sanders) 11.3.0
PaddleSeg: 2.9.0
PaddlePaddle: 2.5.2
OpenCV: 4.8.1
2024-08-02 10:33:12 [INFO]
---------------Config Information---------------
batch_size: 4
iters: 1000
train_dataset:
  dataset_root: data/optic_disc_seg
  mode: train
  num_classes: 2
  train_path: data/optic_disc_seg/train_list.txt
  transforms:
type: CrossEntropyLoss
model:
  backbone:
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
    type: STDC2
  num_classes: 2
  type: PPLiteSeg
2024-08-02 10:33:12 [INFO] Set device: gpu
2024-08-02 10:33:12 [INFO] Use the following config to build model
model:
  backbone:
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
    type: STDC2
  num_classes: 2
  type: PPLiteSeg
W0802 10:33:12.746259 17620 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.7, Runtime API Version: 11.7
W0802 10:33:12.746259 17620 gpu_resources.cc:149] device: 0, cuDNN Version: 8.4.
2024-08-02 10:33:13 [INFO] Loading pretrained model from https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
Connecting to https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet2.tar.gz
Downloading PP_STDCNet2.tar.gz
[==================================================] 100.00%
Uncompress PP_STDCNet2.tar.gz
[==================================================] 100.00%
2024-08-02 10:33:15 [INFO] There are 265/265 variables loaded into STDCNet.
2024-08-02 10:33:15 [INFO] Use the following config to build train_dataset
train_dataset:
  dataset_root: data/optic_disc_seg
  mode: train
  num_classes: 2
  train_path: data/optic_disc_seg/train_list.txt
  transforms:
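I also tried other custom data in several ways (not COCO-format data yet, just ordinary image-segmentation data with PNG annotations and JPG images) and got the same result. In summary, I have two suspicions:
1. An older version of PaddleSeg runs the pp_liteseg model without problems, while the new version (the one I git-cloned on August 1) has a problem reading the images in Annotations: either a grayscale image is read as a three-channel image, or some image package returns different results on Windows and Ubuntu.
2. PaddleSeg on Windows and PaddleSeg on Ubuntu behave differently. My tentative guess is that something in the dataset-reading code prevents the configured labels from being matched to the values annotated in the PNGs on Ubuntu.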
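Both suspicions can be tested outside the training loop by looking at the raw values stored in a label PNG. The sketch below is illustrative only and is not PaddleSeg's own loader: `invalid_label_counts` reimplements the condition quoted in the assertion message (a value must be >= 0 and < num_classes, or equal to 255), and `load_label_pixels` is a hypothetical helper that assumes Pillow is installed.

```python
from collections import Counter


def invalid_label_counts(pixels, num_classes=2, ignore_index=255):
    """Return {value: count} for label values the loss would reject.

    A value is accepted when 0 <= value < num_classes or value == ignore_index.
    Non-integer values (for example RGB tuples) are always reported.
    """
    counts = Counter(pixels)
    return {
        value: n
        for value, n in counts.items()
        if not (
            isinstance(value, int)
            and (0 <= value < num_classes or value == ignore_index)
        )
    }


def load_label_pixels(path):
    """Read a label image and return (mode, flat list of pixel values).

    Assumption: Pillow is available. It is imported lazily so that the
    checker above works without it.
    """
    from PIL import Image

    with Image.open(path) as img:
        # Mode "P" yields palette indices and "L" yields gray levels;
        # mode "RGB" yields (r, g, b) tuples, which a label map should not contain.
        return img.mode, list(img.getdata())


if __name__ == "__main__":
    # Synthetic pixels containing the two values reported in the log.
    sample = [0, 0, 1, 255, 218, 89, 89]
    print(invalid_label_counts(sample))  # {218: 1, 89: 2}
```

Running `load_label_pixels` on the same annotation file on both machines and comparing the reported mode and the rejected values would show whether the files themselves differ or only the way they are decoded.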
Environment
platform: Linux-6.5.0-35-generic-x86_64-with-glibc2.35
Python: 3.9.19 (main, Apr 6 2024, 17:57:55) [GCC 11.4.0]
Paddle compiled with cuda: True
NVCC: Build cuda_11.8.r11.8/compiler.31833905_0
cudnn: 8.6
GPUs used: 1
CUDA_VISIBLE_DEVICES: None
GPU: ['GPU 0: NVIDIA GeForce', 'GPU 1: NVIDIA GeForce']
GCC: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
PaddleSeg: 0.0.0.dev0
PaddlePaddle: 2.6.1
OpenCV: 4.10.0
Note: Ubuntu 22.04, installed following the tutorial; the installation check (sh tests/install/check_predict.sh) passes.
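Since the Windows run succeeds and the Ubuntu run fails, it helps to see exactly which package versions differ between the two environment dumps in this report. The following is a small sketch with hypothetical helper names; it only pulls "key: value" pairs out of the pasted text and makes no claim about which difference causes the error.

```python
import re


def parse_versions(dump, keys=("PaddleSeg", "PaddlePaddle", "OpenCV")):
    """Extract the version string that follows each 'key:' in an environment dump."""
    versions = {}
    for key in keys:
        match = re.search(rf"\b{re.escape(key)}:\s*(\S+)", dump)
        versions[key] = match.group(1) if match else None
    return versions


def diff_versions(left, right):
    """Return {key: (left_value, right_value)} for every key whose values differ."""
    keys = sorted(set(left) | set(right))
    return {k: (left.get(k), right.get(k)) for k in keys if left.get(k) != right.get(k)}


if __name__ == "__main__":
    # Values copied from the two logs in this report.
    ubuntu = "PaddleSeg: 0.0.0.dev0 PaddlePaddle: 2.6.1 OpenCV: 4.10.0"
    windows = "PaddleSeg: 2.9.0 PaddlePaddle: 2.5.2 OpenCV: 4.8.1"
    for name, (u, w) in diff_versions(parse_versions(ubuntu), parse_versions(windows)).items():
        print(f"{name}: ubuntu={u} windows={w}")
```

All three packages differ between the two machines, so repeating the run with only one of them changed at a time would narrow down which one matters.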
Bug description confirmation
Are you willing to submit a PR?