Closed Baymax0525 closed 1 year ago
Hi! We have received your issue and will arrange for a technician to respond as soon as possible; please be patient. Please double-check that you have provided a clear problem description, reproduction code, environment and version details, and error messages. You can also look for answers in the official API documentation, the FAQ, past GitHub issues, and the AI community. Have a nice day!
This warning can be ignored on NPU. It says that eager mode is not supported yet, but that generally does not prevent programs from running. What specific error did you run into?
/opt/conda/lib/python3.7/site-packages/paddle/fluid/framework.py:189: UserWarning: We will fallback into legacy dygraph on NPU/XPU/MLU/IPU/ROCM devices. Because we only support new eager dygraph mode on CPU/GPU currently.
"We will fallback into legacy dygraph on NPU/XPU/MLU/IPU/ROCM devices. Because we only support new eager dygraph mode on CPU/GPU currently. "
Running verify PaddlePaddle program ...
I1014 09:06:06.187234 54891 interpretercore.cc:235] New Executor is Running.
Traceback (most recent call last):
File "
File "<string>", line 1, in <module>
File "/opt/conda/lib/python3.7/site-packages/paddle/utils/install_check.py", line 266, in run_check
_run_static_single(use_cuda, use_xpu, use_npu)
File "/opt/conda/lib/python3.7/site-packages/paddle/utils/install_check.py", line 156, in _run_static_single
input, out, weight = _simple_network()
File "/opt/conda/lib/python3.7/site-packages/paddle/utils/install_check.py", line 33, in _simple_network
attr=paddle.ParamAttr(initializer=paddle.nn.initializer.Constant(0.1)))
File "/opt/conda/lib/python3.7/site-packages/paddle/fluid/layers/tensor.py", line 151, in create_parameter
default_initializer)
File "/opt/conda/lib/python3.7/site-packages/paddle/fluid/layer_helper_base.py", line 383, in create_parameter
**attr._to_kwargs(with_initializer=True))
File "/opt/conda/lib/python3.7/site-packages/paddle/fluid/framework.py", line 3790, in create_parameter
initializer(param, self)
File "/opt/conda/lib/python3.7/site-packages/paddle/fluid/initializer.py", line 54, in __call__
return self.forward(param, block)
File "/opt/conda/lib/python3.7/site-packages/paddle/fluid/initializer.py", line 191, in forward
stop_gradient=True)
File "/opt/conda/lib/python3.7/site-packages/paddle/fluid/framework.py", line 3840, in append_op
attrs=kwargs.get("attrs", None))
File "/opt/conda/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2748, in __init__
for frame in traceback.extract_stack():
ResourceExhaustedError: Not enough available NPU memory.
[Hint: Expected available_to_alloc > 0, but received available_to_alloc:0 <= 0:0.] (at /home/longiuser/workspace/quanjia/Paddle/paddle/fluid/platform/device/npu/npu_info.cc:160)
[operator < fill_constant > error]
I checked npu-smi info and the cards look healthy:
+------------------------------------------------------------------------------+
| npu-smi 21.0.2                       Version: 21.0.2                         |
+-------------------+-----------------+----------------------------------------+
| NPU     Name      | Health          | Power(W)          Temp(C)              |
| Chip    Device    | Bus-Id          | AICore(%)         Memory-Usage(MB)     |
+===================+=================+========================================+
| 976     310       | OK              | 12.8              59                   |
| 0       0         | 0000:3D:00.0    | 0                 2703 / 8192          |
+===================+=================+========================================+
| 992     310       | OK              | 12.8              60                   |
| 0       1         | 0000:3E:00.0    | 0                 2703 / 8192          |
+===================+=================+========================================+
| 1008    310       | OK              | 12.8              62                   |
| 0       2         | 0000:3F:00.0    | 0                 2703 / 8192          |
+===================+=================+========================================+
| 1024    310       | OK              | 12.8              60                   |
| 0       3         | 0000:40:00.0    | 0                 2703 / 8192          |
+===================+=================+========================================+
| 2176    310       | OK              | 12.8              63                   |
| 0       4         | 0000:88:00.0    | 0                 2703 / 8192          |
+===================+=================+========================================+
| 2192    310       | OK              | 12.8              63                   |
| 0       5         | 0000:89:00.0    | 0                 2703 / 8192          |
+===================+=================+========================================+
| 2208    310       | OK              | 12.8              62                   |
| 0       6         | 0000:8A:00.0    | 0                 2703 / 8192          |
+===================+=================+========================================+
| 2224    310       | OK              | 12.8              63                   |
| 0       7         | 0000:8B:00.0    | 0                 2703 / 8192          |
+===================+=================+========================================+
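The Memory-Usage(MB) column in the npu-smi output above is reported as "used / total". As a rough sketch (the helper name free_memory_mb is hypothetical, and the parsing assumes only the text layout shown in this thread), the free memory per chip can be computed like this:

```python
import re

# Matches a "used / total" pair such as "2703 / 8192" or "28384/ 32255".
MEM_RE = re.compile(r"(\d+)\s*/\s*(\d+)")


def free_memory_mb(npu_smi_text):
    """Return (used, total, free) in MB for every "used / total" pair found.

    Assumption: the input is plain npu-smi text in the layout shown in this
    thread. On 910 output each row carries two pairs (memory and HBM), and
    both are returned in order.
    """
    rows = []
    for used, total in MEM_RE.findall(npu_smi_text):
        used, total = int(used), int(total)
        rows.append((used, total, total - used))
    return rows


sample = "| 0       0         | 0000:3D:00.0    | 0                 2703 / 8192          |"
print(free_memory_mb(sample))  # [(2703, 8192, 5489)]
```

By this arithmetic each 310 chip still has 5489 MB free, so the reported numbers alone do not account for the "Not enough available NPU memory" error.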
The NPU backend currently supports the Ascend 910; the 310 has not been adapted yet.
I see, thank you. I am new to NPUs and can find very little material on them. Do you have any resources you could share? Thanks!
I installed paddle successfully on an Ascend 910, but it reports that dynamic graph mode is not supported (only support new eager dygraph mode on CPU/GPU).
(base) λ ai-training /PaddleSeg {release/2.6} python -c "import paddle; paddle.utils.run_check()"
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
/opt/conda/lib/python3.7/site-packages/paddle/fluid/framework.py:189: UserWarning: We will fallback into legacy dygraph on NPU/XPU/MLU/IPU/ROCM devices. Because we only support new eager dygraph mode on CPU/GPU currently.
"We will fallback into legacy dygraph on NPU/XPU/MLU/IPU/ROCM devices. Because we only support new eager dygraph mode on CPU/GPU currently. "
Running verify PaddlePaddle program ...
I1017 11:12:41.262614 57120 interpretercore.cc:235] New Executor is Running.
I1017 11:12:48.488742 57120 interpretercore_util.cc:430] Standalone Executor is Used.
PaddlePaddle works well on 1 NPU.
PaddlePaddle works well on 1 NPUs.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
The new dynamic graph (eager) mode is not yet supported on NPU, but you can still train with paddle normally using the legacy dynamic graph mode. Did you run into a specific error?
There is no error, but training takes as long as it does on the CPU. npu-smi info shows the following usage:
+------------------------------------------------------------------------------------+
| npu-smi 1.8.20                    Version: 20.2.2                                  |
+----------------------+---------------+---------------------------------------------+
| NPU     Name         | Health        | Power(W)          Temp(C)                   |
| Chip                 | Bus-Id        | AICore(%)  Memory-Usage(MB)  HBM-Usage(MB)   |
+======================+===============+=============================================+
| 1       910B         | OK            | 82.7              79                        |
| 0                    | 0000:3B:00.0  | 0          2170 / 15505      0 / 32255       |
+======================+===============+=============================================+
Call paddle.device.set_device("npu") at the start of the program.
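A minimal sketch of where the set_device call above would sit in a script. The helper choose_device is hypothetical, and the check paddle.is_compiled_with_npu is an assumption about the Paddle build in use (it is looked up defensively because it may not exist in every release):

```python
def choose_device(npu_available):
    """Prefer the NPU when the build supports it, otherwise use the CPU."""
    return "npu" if npu_available else "cpu"


try:
    import paddle
except ImportError:
    paddle = None  # lets the sketch run on machines without Paddle installed

if paddle is not None:
    # Assumption: this build exposes paddle.is_compiled_with_npu; getattr keeps
    # the sketch from failing on builds that do not.
    check = getattr(paddle, "is_compiled_with_npu", None)
    device = choose_device(bool(check()) if callable(check) else False)
    # Select the device before any model or tensor is created.
    paddle.device.set_device(device)
    print(paddle.device.get_device())
```

Selecting the device first means every parameter created afterwards is placed on it; this does not by itself guarantee that each operator has an NPU kernel.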
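Regarding calling paddle.device.set_device("npu") at the start of the program: I tested with PaddleSeg and added --device=npu. The card is indeed being used, but the training time still has not gone down. I am using paddle 2.3 built and installed from source. Could this be related to the version?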
python train.py --config configs/quick_start/pp_liteseg_optic_disc_512x512.yml --device=npu --iters 6000 --do_eval --save_interval 20 --save_dir output/pp_liteseg_optic_disc
+------------------------------------------------------------------------------------+
| npu-smi 1.8.20                    Version: 20.2.2                                  |
+----------------------+---------------+---------------------------------------------+
| NPU     Name         | Health        | Power(W)          Temp(C)                   |
| Chip                 | Bus-Id        | AICore(%)  Memory-Usage(MB)  HBM-Usage(MB)   |
+======================+===============+=============================================+
| 1       910B         | OK            | 83.3              80                        |
| 0                    | 0000:3B:00.0  | 3          2325 / 15505      28384/ 32255    |
+======================+===============+=============================================+
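Hi @Baymax0525,
The CANN operator library is missing many of the operators that PaddleSeg models need, so a large number of operators in PaddleSeg-type models do not yet have an NPU implementation. These models run functionally, but the missing operators fall back to the CPU by default, which makes real-world performance very poor, roughly at CPU level.
We will try to contact Huawei's CANN team to develop the missing operators and fill the gaps for PaddleSeg. In the meantime, we have upgraded to the latest CANN 512 release, and you can try this image: registry.baidubce.com/device/paddle-npu:cann512-x86_64-gcc82
We will also keep publishing Paddle's latest models and images for Ascend on the Paddle website.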
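To illustrate why CPU fallback can keep training near CPU speed, here is a rough Amdahl-style estimate. It is only a sketch: the function name, the fractions, and the 10x kernel speedup are hypothetical numbers, not measurements from this thread, and the model ignores host-to-device copy overhead, which would make fallback look even worse.

```python
def overall_speedup(npu_fraction, npu_speedup):
    """Estimate end-to-end speedup over a pure-CPU run.

    npu_fraction: share of the original CPU runtime spent in operators that
                  do have an NPU kernel (0.0 to 1.0).
    npu_speedup:  how much faster those operators run on the NPU.
    The remaining (1 - npu_fraction) of the work still runs at CPU speed.
    """
    if not 0.0 <= npu_fraction <= 1.0:
        raise ValueError("npu_fraction must be between 0 and 1")
    if npu_speedup <= 0:
        raise ValueError("npu_speedup must be positive")
    return 1.0 / ((1.0 - npu_fraction) + npu_fraction / npu_speedup)


# Hypothetical scenarios: NPU kernels assumed to be 10x faster than CPU.
for fraction in (0.2, 0.5, 0.9):
    print(f"{fraction:.0%} of work on NPU: {overall_speedup(fraction, 10.0):.2f}x")
```

Under these assumptions, even with half of the runtime accelerated the overall gain stays below 2x, which is consistent with training that feels about as slow as on the CPU.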
Thanks!
Issue Description
The docker image I am using is the latest-dev-cann5.0.2.alpha005-gcc82-x86_64 tag from:
https://hub.docker.com/r/paddlepaddle/paddle/tags?page=1&name=cann
However, the official site only documents the build procedure for the ARM architecture:
https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/hardware_support/npu_docs/paddle_install_cn.html
My machine is x86, so I removed -DWITH_ARM=ON from the build options. The full set of options is:
cmake .. -DPY_VERSION=3.7 -DWITH_ASCEND=OFF -DWITH_ASCEND_CL=ON -DWITH_ASCEND_INT64=ON -DWITH_DISTRIBUTE=ON -DWITH_TESTING=ON -DON_INFER=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
The build produced the package paddlepaddle_npu-0.0.0-cp37-cp37m-linux_x86_64.whl. After installing it with pip, import paddle prints the following message:
UserWarning: We will fallback into legacy dygraph on NPU/XPU/MLU/IPU/ROCM devices. Because we only support new eager dygraph mode on CPU/GPU currently.
I saw that someone had built and installed it successfully at https://gitee.com/ascend/modelzoo/issues/I571T4#note_13626125 but I have not managed to. Could anyone help take a look? Thanks.
Version & Environment Information
Installed from a docker image