Closed Tlntin closed 2 years ago
Hi! We've received your issue; please be patient while waiting for a response. We will arrange for technicians to answer your questions as soon as possible. Please make sure that you have provided enough information to demonstrate your request. You may also check the API docs, FAQ, GitHub Issues, and AI community to find an answer. Have a nice day!
Hello, and thank you for sharing this. Is there anything else that needs follow-up?
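One addition: to make the build pick up the correct version number, run the following before the cmake command:
export PADDLE_VERSION="2.3.1"
Other than that, nothing else.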
Thank you very much for the installation guide. After following your steps one by one, I ran into the following .cu compilation errors:
1 error detected in the compilation of "/home/dell/code/Paddle/paddle/phi/kernels/funcs/eigen/reverse.cu".
1 error detected in the compilation of "/home/dell/code/Paddle/paddle/phi/kernels/funcs/eigen/pad.cu".
make[2]: [paddle/phi/kernels/funcs/eigen/CMakeFiles/eigen_function.dir/build.make:314:paddle/phi/kernels/funcs/eigen/CMakeFiles/eigen_function.dir/pad.cu.o] Error 1
1 error detected in the compilation of "/home/dell/code/Paddle/paddle/phi/kernels/funcs/eigen/broadcast.cu".
make[2]: [paddle/phi/kernels/funcs/eigen/CMakeFiles/eigen_function.dir/build.make:230:paddle/phi/kernels/funcs/eigen/CMakeFiles/eigen_function.dir/broadcast.cu.o] Error 1
make[1]: [CMakeFiles/Makefile2:56411:paddle/phi/kernels/funcs/eigen/CMakeFiles/eigen_function.dir/all] Error 2
make[1]: Waiting for unfinished jobs....
[ 10%] Linking CXX static library libscope.a
[ 10%] Built target scope
make: *** [Makefile:136:all] Error 2
Did you run into these problems during compilation? If so, how did you solve them?
I didn't run into these problems. What are your CUDA, cuDNN, and NCCL versions? Are they all installed under /usr/local/cuda?
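I went to all the trouble of upgrading from 20.04 to 22.04, only to find that Paddle no longer works. Sigh...
The PaddlePaddle develop branch already supports Ubuntu 22.04; official support will ship with the 2.4 release.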
DISTRIB_RELEASE=22.04
DISTRIB_DESCRIPTION="Ubuntu 22.04.1 LTS"
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
Thanks for shortening the journey, Build time 1h 35m
Will Paddle 2.4 support Ubuntu 22.04, and when is it expected to be released?
LOL, "Ubuntu 24.04 LTS" has this problem too.
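Why compile it yourself
Environment
Built-in Python environment (it has little bearing on the build; shown only for reference)
CMake environment (a newer version is recommended; it seems 3.19 or later is required)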
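The version requirement above can be checked with a short script. This is a minimal sketch, assuming cmake is on PATH and that 3.19 really is the minimum (the post itself only says it "seems" so). The names `parse_cmake_version` and `cmake_is_new_enough` are illustrative and not part of Paddle's build system.

```python
import re
import subprocess

# Assumption taken from the post: 3.19 "seems" to be the minimum version.
MIN_CMAKE = (3, 19)


def parse_cmake_version(text):
    """Extract (major, minor, patch) from the output of `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no cmake version found in output")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def cmake_is_new_enough(minimum=MIN_CMAKE):
    """Run `cmake --version` and compare (major, minor) with the minimum."""
    result = subprocess.run(
        ["cmake", "--version"], capture_output=True, text=True, check=True
    )
    return parse_cmake_version(result.stdout)[:2] >= minimum
```

Calling `cmake_is_new_enough()` before configuring the build returns `False` for an older CMake, which is a cue to install a newer one first.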
CMake suite maintained and supported by Kitware (kitware.com/cmake).
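System description
CUDA environment
cuDNN environment
Because the gcc installed by conda does not read the system's C/C++ include paths, cuDNN has to be installed from the tar archive.
The tar.xz package chosen is: cudnn-linux-x86_64-8.4.1.50_cuda11.6-archive.tar.xz
A brief installation guide follows: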
The result is shown below; cuDNN 8.4.1 is now detected.
/sbin/ldconfig.real: Path `/usr/lib' given more than once (from:0 and :0)
libcudnn_ops_train.so.8 -> libcudnn_ops_train.so.8.4.1
libcudnn_cnn_train.so.8 -> libcudnn_cnn_train.so.8.4.1
libcudnn_ops_infer.so.8 -> libcudnn_ops_infer.so.8.4.1
libcudnn_adv_infer.so.8 -> libcudnn_adv_infer.so.8.4.1
libcudnn.so.8 -> libcudnn.so.8.4.1
libcudnn_adv_train.so.8 -> libcudnn_adv_train.so.8.4.1
libcudnn_cnn_infer.so.8 -> libcudnn_cnn_infer.so.8.4.1
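The symlink listing above can also be checked programmatically. The sketch below assumes the listing is ldconfig-style output in the form `name.so.8 -> name.so.8.4.1` (the post does not show the command that produced it). `cudnn_versions` is an illustrative helper name.

```python
import re


def cudnn_versions(listing):
    """Return the set of full cuDNN versions found in symlink lines such as
    'libcudnn.so.8 -> libcudnn.so.8.4.1'."""
    pattern = re.compile(
        r"libcudnn\w*\.so\.\d+ -> libcudnn\w*\.so\.(\d+\.\d+\.\d+)"
    )
    return {match.group(1) for match in pattern.finditer(listing)}


sample = """\
libcudnn_ops_train.so.8 -> libcudnn_ops_train.so.8.4.1
libcudnn.so.8 -> libcudnn.so.8.4.1
libcudnn_cnn_infer.so.8 -> libcudnn_cnn_infer.so.8.4.1
"""
print(cudnn_versions(sample))  # prints {'8.4.1'}
```

A result with more than one version would suggest that libraries from different cuDNN releases are mixed on the library path.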
Preparation
# PYTHON_LIBRARY is expected to be set already; its definition is not shown here
export PATH=${PYTHON_LIBRARY}:$PATH
# Locate the python3.9 include directory under the active environment prefix
export PYTHON_INCLUDE_DIRS=$(find "$(dirname "$(dirname "$(which python3)")")/include" -name "python3.9")
# Pick the python3 path that belongs to the conda environment (the entry containing "env")
export PYTHON3_EXECUTABLE=$(for path in $(whereis python3); do echo "$path" | grep env; done)
# Ask NumPy for its header directory
export PYTHON3_NUMPY_INCLUDE_DIRS=$(python -c "import numpy as np; print(np.get_include())")
echo PYTHON3_NUMPY_INCLUDE_DIRS=$PYTHON3_NUMPY_INCLUDE_DIRS
$ g++ --version
g++ (conda-forge gcc 8.5.0-16) 8.5.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
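The same paths can be obtained from the interpreter itself, which avoids searching the filesystem. This is a minimal sketch, assuming it is run with the Python that the build should target. `python_build_paths` is an illustrative name, and the variable names simply mirror the ones exported above.

```python
import sys
import sysconfig


def python_build_paths():
    """Collect interpreter-reported paths for the variables exported above."""
    paths = {
        "PYTHON3_EXECUTABLE": sys.executable,
        "PYTHON_INCLUDE_DIRS": sysconfig.get_paths()["include"],
    }
    try:
        import numpy as np  # optional: only needed for the NumPy headers
    except ImportError:
        return paths
    paths["PYTHON3_NUMPY_INCLUDE_DIRS"] = np.get_include()
    return paths


# Print shell-style export lines that can be pasted into the terminal.
for name, value in python_build_paths().items():
    print(f'export {name}="{value}"')
```

Because the values come from the running interpreter, activating the intended conda environment first is what selects the right paths.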
Build process
Fetch the source and switch to the latest branch
Create and enter the build directory
Set the target Paddle version
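If the build is driven from a script rather than an interactive shell, the version can be passed through the environment of the child process. This sketch assumes the PADDLE_VERSION variable and the 2.3.1 value mentioned in the comments of this thread. `build_env` is an illustrative helper, and the cmake arguments used in the post are not shown, so none are filled in here.

```python
import os


def build_env(paddle_version, base=None):
    """Return a copy of the environment with PADDLE_VERSION set.

    The caller's environment is left untouched so that the setting only
    applies to processes started with the returned mapping.
    """
    env = dict(os.environ if base is None else base)
    env["PADDLE_VERSION"] = paddle_version
    return env


env = build_env("2.3.1")
print(env["PADDLE_VERSION"])  # prints 2.3.1

# Hypothetical usage (the cmake flags are not given in the post):
# subprocess.run(["cmake", ".."], env=env, check=True)
```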
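Configure the build (TensorRT not enabled)
Run the actual build. Note that this step needs unrestricted access to GitHub (for example through a proxy), because make pulls third-party library sources from GitHub. Expect it to take roughly 1-2 hours.
If the build fails with error too many open files, raise the maximum open-file limit; the default is 1024.
Collect the package. It is in the python/dist directory under the build directory, and its file attributes are as follows: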
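The wheel under python/dist can be listed with a few lines of Python. This is a minimal sketch, assuming the build produced one or more `.whl` files there. `list_wheels` is an illustrative name.

```python
from pathlib import Path


def list_wheels(build_dir):
    """Return (file name, size in bytes) for each wheel in <build_dir>/python/dist."""
    dist = Path(build_dir) / "python" / "dist"
    if not dist.is_dir():
        return []
    return sorted((path.name, path.stat().st_size) for path in dist.glob("*.whl"))


# Run from inside the build directory.
for name, size in list_wheels("."):
    print(f"{size / 2**20:8.1f} MiB  {name}")
```

An empty result means the build has not reached the packaging stage yet, or the script was run from the wrong directory.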
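Install the package. In theory, this package works on Ubuntu 22.04/20.04 with a 30-series GPU for anyone who has the same CUDA/cuDNN/NCCL versions as mine, provided cuDNN and NCCL were both installed from archive packages. The version shows as 0.0.0 because that is how every self-compiled build is labeled.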
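The installed version can be read back without importing Paddle. The sketch below uses the standard library's `importlib.metadata`. The distribution name `paddlepaddle-gpu` is an assumption about how the GPU wheel is named, so adjust it if `pip list` shows something different.

```python
from importlib import metadata


def installed_version(dist_name="paddlepaddle-gpu"):
    """Return the installed version of a distribution, or None if it is absent.

    The default name is an assumption about the GPU wheel's distribution name.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print(installed_version())
```

For a self-compiled build this is expected to report 0.0.0, as described above, unless a version was set before running cmake.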
Testing
Running the official test code also appears to work, and training on the GPU runs normally.
import paddle
from paddle.vision.datasets import MNIST
from paddle.vision.models import LeNet
from paddle.vision.transforms import Normalize

# Use the GPU
paddle.device.set_device("gpu:0")
# Fetch the dataset
transform = Normalize(mean=[127.5], std=[127.5], data_format="CHW")
train_dataset = MNIST(mode="train", transform=transform)
valid_dataset = MNIST(mode="test", transform=transform)
# Collect the dataset's label classes
y_list = [da[1][0] for da in train_dataset]
num_list = list(set(y_list))
num_classes = len(num_list)
print(f"The dataset has {num_classes} label classes: {num_list}")
# Build the model
pre_model = LeNet(num_classes=num_classes)
model = paddle.Model(pre_model)
adam = paddle.optimizer.Adam(learning_rate=1e-3, parameters=model.parameters())
model.prepare(adam, loss=paddle.nn.CrossEntropyLoss(), metrics=paddle.metric.Accuracy())
# Train the model
model.fit(train_data=train_dataset, batch_size=64, verbose=1, epochs=5)
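Before committing to a multi-epoch run like the one above, a lighter check can confirm that the self-built package sees CUDA at all. This is a minimal sketch; `paddle_gpu_status` is an illustrative name, and it returns None when Paddle is not importable.

```python
def paddle_gpu_status():
    """Report basic facts about the installed Paddle build, or None if absent."""
    try:
        import paddle
    except ImportError:
        return None
    return {
        "version": paddle.__version__,
        "compiled_with_cuda": paddle.is_compiled_with_cuda(),
        "device": paddle.device.get_device(),
    }


print(paddle_gpu_status())
```

For a fuller self-test, `paddle.utils.run_check()` runs a small computation and prints the "PaddlePaddle is installed successfully!" message quoted earlier in this thread.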