PaddlePaddle / PaddleSeg

Easy-to-use image segmentation library with awesome pre-trained model zoo, supporting wide-range of practical tasks in Semantic Segmentation, Interactive Segmentation, Panoptic Segmentation, Image Matting, 3D Segmentation, etc.
https://arxiv.org/abs/2101.06175
Apache License 2.0

FatalError: `Segmentation fault` is detected by the operating system. #3644

Open dulicui742 opened 8 months ago

dulicui742 commented 8 months ago

Search before asking

Describe the Bug

While running the demo at https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.6/docs/quick_start_cn.md, both python train.py and python val.py run normally, but python predict.py fails with the following error:

C++ Traceback (most recent call last):

0 ImagingZipEncode
1 deflateReset

Error Message Summary:

FatalError: `Segmentation fault` is detected by the operating system.
[TimeInfo: Aborted at 1707286977 (unix time) try "date -d @1707286977" if you are using GNU date ]
[SignalInfo: SIGSEGV (@0x0) received by PID 4193083 (TID 0x7f6cb3650480) from PID 0 ]

Segmentation fault (core dumped)

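After debugging, I confirmed that the error is raised at the very end of the script, when the statement pred_mask.save(pred_saved_path) is executed. Stepping in further, the crash occurs in ImageFile.py while executing the statement errcode, data = encoder.encode(bufsize)[1:].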
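One way to narrow this down is to run the same kind of PNG save outside PaddleSeg. The snippet below is a minimal sketch, not the code from predict.py: the helper name encode_mask_as_png and the synthetic mask are assumptions made for illustration. It encodes a small uint8 label mask to PNG in memory, which goes through Pillow's zlib-based encoder. If this also crashes in the same environment, the problem is likely independent of the model and of the contents of pred_mask.

```python
import io

import numpy as np
from PIL import Image


def encode_mask_as_png(mask: np.ndarray) -> bytes:
    """Encode a 2-D label mask as PNG in memory and return the raw bytes.

    Hypothetical helper for isolating the save step; it is not part of PaddleSeg.
    """
    # A 2-D uint8 array becomes a single-channel 8-bit image.
    img = Image.fromarray(mask.astype(np.uint8))
    buf = io.BytesIO()
    # Saving as PNG exercises Pillow's zlib-based encoder.
    img.save(buf, format="PNG")
    return buf.getvalue()


if __name__ == "__main__":
    # Synthetic 64x64 mask with labels 0 and 1 (illustrative data only).
    demo_mask = np.zeros((64, 64), dtype=np.uint8)
    demo_mask[16:48, 16:48] = 1
    png_bytes = encode_mask_as_png(demo_mask)
    print(f"encoded {len(png_bytes)} bytes")
```

Running this file with the same interpreter that runs predict.py keeps the comparison fair, since both then load the same Pillow and zlib builds.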

Environment

------------Environment Information-------------
platform: Linux-5.19.0-50-generic-x86_64-with-glibc2.35
Python: 3.9.18 (main, Sep 11 2023, 13:41:44) [GCC 11.2.0]
Paddle compiled with cuda: True
NVCC: Build cuda_11.7.r11.7/compiler.31294372_0
cudnn: 8.4
GPUs used: 1
CUDA_VISIBLE_DEVICES: 0
GPU: ['GPU 0: NVIDIA RTX', 'GPU 1: NVIDIA RTX', 'GPU 2: NVIDIA RTX', 'GPU 3: NVIDIA RTX']
GCC: gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
PaddleSeg: 2.9.0
PaddlePaddle: 2.6.0
OpenCV: 4.5.5

Paddle was installed with conda: conda install paddlepaddle-gpu==2.6.0 cudatoolkit=11.6 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/ -c conda-forge
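The environment dump above does not list Pillow or zlib, yet both frames in the C++ traceback (ImagingZipEncode and deflateReset) belong to the PNG compression path. The sketch below collects those versions so they can be added to the report. It only gathers information. Whether a zlib mismatch is actually involved here is an assumption, not something this thread confirms, and the function name zlib_report is hypothetical.

```python
import zlib

import PIL
from PIL import features


def zlib_report() -> dict:
    """Collect the Pillow and zlib versions visible to this interpreter."""
    return {
        # Version of the installed Pillow package.
        "pillow": PIL.__version__,
        # zlib version reported by Pillow (None if unavailable).
        "pillow_zlib": features.version("zlib"),
        # zlib version Python's own zlib module was compiled against.
        "python_zlib_compiled": zlib.ZLIB_VERSION,
        # zlib version Python's zlib module has loaded at runtime.
        "python_zlib_runtime": zlib.ZLIB_RUNTIME_VERSION,
    }


if __name__ == "__main__":
    for name, value in zlib_report().items():
        print(f"{name}: {value}")
```

If the reported versions differ from one another, that is worth mentioning in the issue, although a difference alone does not prove it is the cause.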

Bug description confirmation

Are you willing to submit a PR?

shiyutang commented 8 months ago

Hello. Based on the information you provided, the error occurs while saving the file. Please check whether pred_mask contains any illegal values.
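A check along these lines could look like the sketch below. It assumes the prediction is available as a NumPy array of class indices before it is converted to an image. The function name find_mask_problems and the num_classes parameter are hypothetical and not part of PaddleSeg. The checks cover shape, NaN/Inf values, and labels outside the expected range.

```python
import numpy as np


def find_mask_problems(mask, num_classes: int) -> list:
    """Return a list of human-readable problems found in a label mask.

    An empty list means none of the checks below failed.
    """
    arr = np.asarray(mask)
    problems = []

    if arr.ndim != 2:
        problems.append(f"expected a 2-D mask, got shape {arr.shape}")
    if arr.size == 0:
        problems.append("mask is empty")
        return problems

    # NaN/Inf can only occur in floating-point arrays.
    if np.issubdtype(arr.dtype, np.floating) and not np.isfinite(arr).all():
        problems.append("mask contains NaN or Inf")
        return problems

    lo, hi = arr.min(), arr.max()
    if lo < 0 or hi >= num_classes:
        problems.append(
            f"labels outside [0, {num_classes - 1}]: min={lo}, max={hi}"
        )
    return problems


if __name__ == "__main__":
    # Illustrative mask with one out-of-range label for a 2-class task.
    demo = np.array([[0, 1], [1, 7]], dtype=np.int64)
    for problem in find_mask_problems(demo, num_classes=2):
        print(problem)
```

If the list comes back empty for the real prediction, the mask values are unlikely to be the cause, and the crash is more likely in the image encoding step itself.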

TheMattBin commented 6 months ago

I ran into a similar issue when training my own model, HrSegNet. I can train the model on Windows but not on Linux, and my Linux environment is the same as yours. I am not sure what causes the issue.

ezone1987 commented 1 month ago

Same segmentation fault here.