PaddlePaddle / Paddle

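PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of 『飞桨』 (PaddlePaddle): high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)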
http://www.paddlepaddle.org/
Apache License 2.0
21.8k stars 5.47k forks

Paddle error: FatalError: `Segmentation fault` is detected by the operating system. #65242

Open tianjiahao opened 3 weeks ago

tianjiahao commented 3 weeks ago

Describe the Bug


C++ Traceback (most recent call last):

0 paddle::imperative::Tracer::TraceOp(std::string const&, paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::AttributeMap, std::map<std::string, std::string, std::less, std::allocator<std::pair<std::string const, std::string > > > const&)
1 void paddle::imperative::Tracer::TraceOpImpl(std::string const&, paddle::imperative::details::NameVarMapTrait::Type const&, paddle::imperative::details::NameVarMapTrait::Type const&, paddle::framework::AttributeMap&, phi::Place const&, bool, std::map<std::string, std::string, std::less, std::allocator<std::pair<std::string const, std::string > > > const&, paddle::framework::AttributeMap*, bool)
2 paddle::platform::is_gpu_place(phi::Place const&)

Error Message Summary:

FatalError: Segmentation fault is detected by the operating system.
[TimeInfo: Aborted at 1718681277 (unix time) try "date -d @1718681277" if you are using GNU date ]
[SignalInfo: SIGSEGV (@0x19) received by PID 1 (TID 0x7f209add7700) from PID 25 ]
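For readers without GNU date, the TimeInfo and SignalInfo fields of the summary above can be decoded with a few lines of Python. This is only a sketch: the helper name `parse_fatal_summary` and the regular expressions are assumptions written against the exact text pasted in this issue, not part of Paddle, and other Paddle versions may print the summary differently.

```python
import re
from datetime import datetime, timezone

# Summary text copied from the report above.
SUMMARY = (
    '[TimeInfo: Aborted at 1718681277 (unix time) try "date -d @1718681277" '
    "if you are using GNU date ] "
    "[SignalInfo: SIGSEGV (@0x19) received by PID 1 (TID 0x7f209add7700) from PID 25 ]"
)


def parse_fatal_summary(text):
    """Hypothetical helper: pull the timestamp, signal, address and PID out of the summary."""
    time_match = re.search(r"Aborted at (\d+) \(unix time\)", text)
    signal_match = re.search(
        r"(SIG\w+) \(@(0x[0-9a-fA-F]+)\) received by PID (\d+)", text
    )
    if time_match is None or signal_match is None:
        raise ValueError("summary does not match the expected layout")
    # Same conversion as `date -u -d @<seconds>`, expressed in UTC.
    aborted_at = datetime.fromtimestamp(int(time_match.group(1)), tz=timezone.utc)
    return {
        "time": aborted_at.isoformat(),
        "signal": signal_match.group(1),
        "address": int(signal_match.group(2), 16),
        "pid": int(signal_match.group(3)),
    }


info = parse_fatal_summary(SUMMARY)
print(info["time"])    # 2024-06-18T03:27:57+00:00
print(info["signal"])  # SIGSEGV
```

The crash was therefore recorded on 18 June 2024 at 03:27:57 UTC, in the process running as PID 1 inside the container.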

Additional Supplementary Information

CUDA: 11.7, Python: 3.8, paddlepaddle-gpu: 2.3.0. The error occurs when running inside a container. Is there any way to fix it?

LokeZhou commented 2 weeks ago

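This looks like Paddle 2.3.0 is not compatible with CUDA 11.7. You can refer to this page and switch to a different Paddle version or a different CUDA version: https://www.paddlepaddle.org.cn/install/old?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html#old-version-anchor-20-%E7%8E%AF%E5%A2%83%E6%94%AF%E6%8C%81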
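One way to check for such a mismatch is to compare the CUDA version the installed wheel was built against with the toolkit version in the container. The sketch below is only an illustration: `parse_version`, `cuda_matches` and `report` are hypothetical helper names, the installed toolkit version has to be supplied by hand (for example from `nvcc --version`), and treating only an exact major.minor match as compatible is a simplifying assumption, not Paddle's official rule.

```python
def parse_version(text):
    """Turn a version string such as '11.7' into (11, 7) for comparison."""
    return tuple(int(part) for part in text.strip().split(".")[:2])


def cuda_matches(built_with, installed):
    """Assumption: only an exact major.minor match counts as compatible."""
    return parse_version(built_with) == parse_version(installed)


def report():
    """Print what the installed Paddle wheel was built against, if Paddle is available."""
    try:
        import paddle
    except ImportError:
        print("paddle is not installed in this environment")
        return
    print("paddle version:", paddle.__version__)
    print("compiled with CUDA:", paddle.device.is_compiled_with_cuda())
    print("built against CUDA:", paddle.version.cuda())
    print("built against cuDNN:", paddle.version.cudnn())
    # Paddle's built-in installation self-test; it may crash the same way
    # in a broken environment, so it runs last.
    paddle.utils.run_check()


# Example with made-up build versions compared against the 11.7 toolkit from this report.
print(cuda_matches("11.2", "11.7"))  # False
print(cuda_matches("11.7", "11.7"))  # True
```

Calling `report()` inside the failing container shows the wheel's build-time CUDA and cuDNN versions, which can then be compared with the environment table on the install page linked above.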

kbwzy commented 1 week ago

Same problem. CUDA: 11.4, Python: 3.8, paddlepaddle-gpu: 2.3.1. The error also occurs in a container. Has the original poster solved it?