Closed: minboo closed this issue 2 weeks ago
Hi! We have received your issue; please be patient while waiting for a response. We will arrange for technicians to answer your question as soon as possible. Please check again that you have provided a clear problem description, reproduction code, environment and version details, and error messages. You can also look for answers in the API documentation, FAQ, GitHub issue history, and the AI community. Have a nice day!
I suggest first backing up and then removing the file named in the error message: /usr/local/python3.7.5/lib/python3.7/site-packages/paddle/fluid/libpaddle.so
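This problem is caused by patchelf detecting an inconsistent page size on ARMv8.
For details, see the PR description at https://github.com/NixOS/patchelf/pull/216. I suggest running the following command in your environment to upgrade patchelf in both the build and the runtime environment, then rebuilding and rerunning.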
wget -O /opt/0.14.5.tar.gz https://github.com/NixOS/patchelf/archive/refs/tags/0.14.5.tar.gz && \
cd /opt && tar xzf 0.14.5.tar.gz && cd /opt/patchelf-0.14.5 && ./bootstrap.sh && ./configure && \
make && make install && cd /opt && rm -rf patchelf-0.14.5 && rm -rf 0.14.5.tar.gz
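Before rebuilding, it may help to confirm that the patchelf found on PATH is really the upgraded one. The following is a minimal sketch, not part of the original reply: the helper names `parse_version` and `patchelf_is_recent` are hypothetical, and it assumes `patchelf --version` prints a line such as `patchelf 0.14.5`.

```python
import re
import shutil
import subprocess

# Version installed by the upgrade command above.
MIN_PATCHELF = (0, 14, 5)


def parse_version(text):
    """Extract the first dotted version number from text such as 'patchelf 0.14.5'."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def patchelf_is_recent(minimum=MIN_PATCHELF):
    """Return True if the patchelf on PATH reports a version >= minimum.

    Returns False when patchelf is not installed at all.
    """
    exe = shutil.which("patchelf")
    if exe is None:
        return False
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=True
    )
    return parse_version(result.stdout) >= minimum
```

Calling `patchelf_is_recent()` in both the build container and the runtime environment shows whether either one still picks up an older binary.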
Hello, I upgraded patchelf as you suggested and the problem above is resolved, but verifying the installation now produces the following error:
root@localhost:/home/Paddle/build# python3
Python 3.7.10 (default, Mar 15 2021, 20:52:10)
[GCC 10.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
>>> paddle.utils.run_check()
Running verify PaddlePaddle program ...
I0313 19:29:32.091966 180539 interpretercore.cc:282] New Executor is Running.
I0313 19:29:32.099856 180539 interpreter_util.cc:574] Standalone Executor is Used.
PaddlePaddle works well on 1 CPU.
/opt/conda/lib/python3.7/site-packages/paddle/distributed/spawn.py:305: UserWarning: Your model will be trained under CPUONLY mode by using GLOO,because CPUPlace is specified manually or your installed PaddlePaddle only support CPU Device.
"Your model will be trained under CPUONLY mode by using GLOO,"
grep: grep: warning: GREP_OPTIONS is deprecated; please use an alias or scriptwarning: GREP_OPTIONS is deprecated; please use an alias or script
I0313 19:29:32.950970 180739 tcp_utils.cc:179] The server starts to listen on IP_ANY:53387
I0313 19:29:32.951071 180740 tcp_utils.cc:128] Successfully connected to 127.0.0.1:53387
I0313 19:29:32.951176 180739 tcp_utils.cc:128] Successfully connected to 127.0.0.1:53387
--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
No stack trace in paddle, may be caused by external reasons.
----------------------
Error Message Summary:
----------------------
FatalError: `Termination signal` is detected by the operating system.
[TimeInfo: *** Aborted at 1678706973 (unix time) try "date -d @1678706973" if you are using GNU date ***]
[SignalInfo: *** SIGTERM (@0x2c13b) received by PID 180739 (TID 0xfffc81b128b0) from PID 180539 ***]
WARNING:root:PaddlePaddle meets some problem with 2 CPUs. This may be caused by:
1. There is not enough GPUs visible on your system
2. Some GPUs are occupied by other process now
3. NVIDIA-NCCL2 is not installed correctly on your system. Please follow instruction on https://github.com/NVIDIA/nccl-tests
to test your NCCL, or reinstall it following https://docs.nvidia.com/deeplearning/sdk/nccl-install-guide/index.html
WARNING:root:
Original Error is:
----------------------------------------------
Process 1 terminated with the following error:
----------------------------------------------
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/paddle/distributed/spawn.py", line 394, in _func_wrapper
result = func(*args)
File "/opt/conda/lib/python3.7/site-packages/paddle/utils/install_check.py", line 199, in train_for_run_parallel
paddle.distributed.init_parallel_env()
File "/opt/conda/lib/python3.7/site-packages/paddle/distributed/parallel.py", line 1104, in init_parallel_env
pg_options=None,
File "/opt/conda/lib/python3.7/site-packages/paddle/distributed/collective.py", line 152, in _new_process_group_impl
pg = core.ProcessGroupGloo.create(store, rank, world_size, group_id)
AttributeError: module 'paddle.fluid.libpaddle' has no attribute 'ProcessGroupGloo'
PaddlePaddle is installed successfully ONLY for single CPU! Let's start deep learning with PaddlePaddle now.
What is causing this? My host machine has an NPU, and I compiled inside a Docker container with CANN installed. Here is my docker run command:
docker run -itd --name npu-cann502 -v /home/myfile:/home/myfile \
--pids-limit 409600 --network=host --shm-size=128G \
--cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
--device=/dev/davinci0 --device=/dev/davinci1 \
--device=/dev/davinci2 --device=/dev/davinci3 \
--device=/dev/davinci_manager \
--device=/dev/devmm_svm \
--device=/dev/hisi_hdc \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/dcmi:/usr/local/dcmi \
paddlepaddle/paddle:latest-dev-cann5.0.2.alpha005-gcc82-aarch64 /bin/bash
I don't know what went wrong. Could you tell me how to compile for Huawei NPUs?
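The AttributeError in the log above means the compiled `paddle.fluid.libpaddle` module does not expose `ProcessGroupGloo`, which `run_check()` needs for its multi-process test. One possible explanation, which this thread never confirms, is that the build did not enable Gloo support (Paddle has a `WITH_GLOO` CMake option). The sketch below only checks whether the binding is present; the helper names are hypothetical.

```python
import importlib


def missing_attributes(module, names):
    """Return the names that `module` does not expose as attributes."""
    return [name for name in names if not hasattr(module, name)]


def check_gloo_binding(module_name="paddle.fluid.libpaddle"):
    """Report whether the compiled module exposes ProcessGroupGloo.

    Returns True or False, or None if the module cannot be imported.
    The default module name is taken from the traceback above.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return not missing_attributes(module, ["ProcessGroupGloo"])
```

If `check_gloo_binding()` returns False, single-process use should still work, as the final line of the log indicates, but the parallel check will keep failing until the binding is compiled in.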
Could you share the whl file produced by your build? Many thanks. My email is zhujihuai@gmail.com
Is there a prebuilt PaddlePaddle whl file adapted for the Ascend NPU?
Since you haven't replied for more than a year, we have closed this issue/PR. If the problem is not solved or there is a follow-up, please reopen it at any time and we will continue to follow up.
Issue Description
After building the PaddlePaddle whl package, running import paddle produces the following error:
Below is the log output by cmake:
Version & Environment Information
PaddlePaddle version: built from the develop branch
Operating system: the host runs UOS-1050e; the image used is Ubuntu 18.04, pulled with
docker pull registry.baidubce.com/paddlepaddle/serving:ascend-aarch64-cann3.3.0-paddlelite-devel
CPU: Kunpeng 920 (host)
NPU: Atlas310I, model 3000
Python version: python3.7.5
gcc: 8.4.0
cmake version: 3.16.8
Compiled following this document: https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/install/compile/arm-compile.html
cmake command:
cmake .. -DPY_VERSION=3.7 -DPYTHON_EXECUTABLE=`which python3` -DWITH_ARM=ON -DWITH_TESTING=OFF -DCMAKE_BUILD_TYPE=Release -DON_INFER=ON -DWITH_XBYAK=OFF
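To confirm that the options from the cmake command above were actually recorded, the CMakeCache.txt file in the build directory can be inspected. This is a minimal sketch under the assumption that the cache uses the usual `NAME:TYPE=VALUE` lines; the helper names and the `EXPECTED` table are hypothetical and simply mirror the flags in the command.

```python
# Options passed on the cmake command line above.
EXPECTED = {
    "WITH_ARM": "ON",
    "WITH_TESTING": "OFF",
    "CMAKE_BUILD_TYPE": "Release",
    "ON_INFER": "ON",
    "WITH_XBYAK": "OFF",
}


def read_cmake_cache(text):
    """Parse CMakeCache.txt content into a {name: value} dict."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' and '//' comment lines CMake writes.
        if not line or line.startswith(("#", "//")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # Drop the ':TYPE' suffix, e.g. 'WITH_ARM:BOOL' -> 'WITH_ARM'.
        entries[key.split(":", 1)[0]] = value
    return entries


def mismatched_options(cache, expected=EXPECTED):
    """Return {name: actual_value} for options that differ from `expected`.

    Options missing from the cache are reported with the value None.
    """
    return {
        name: cache.get(name)
        for name, want in expected.items()
        if cache.get(name) != want
    }
```

Passing the contents of CMakeCache.txt to `read_cmake_cache` and the result to `mismatched_options` yields an empty dict when every flag was applied as written.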