zhouwg closed this issue 3 months ago
Did you try ExecuTorch delegate? https://github.com/pytorch/executorch/tree/main/backends/qualcomm
Thanks for your response.
Deploying ExecuTorch on Android is not an easy thing (it seems ExecuTorch has many dependencies on Android) and I haven't tried it yet. Could you help explain what special approach ExecuTorch uses with the QNN API (I have read the QNN SDK reference manual many times)?
I'll try ExecuTorch on Android later. Thanks.
Check out ExecuTorch Android demo app which can leverage QNN - https://pytorch.org/executorch/stable/demo-apps-android.html
Thanks so much for your guidance. I'm trying ExecuTorch on Android but failed at the setup stage of ExecuTorch's dev environment, caused by the GFW:
fatal: the remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed
fatal: clone of 'https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.git' into submodule path '/home/weiguo/executorch/backends/vulkan/third-party/VulkanMemoryAllocator' failed
Failed to clone 'backends/vulkan/third-party/VulkanMemoryAllocator'. Retry scheduled
Cloning into '/home/weiguo/executorch/backends/xnnpack/third-party/XNNPACK'...
Cloning into '/home/weiguo/executorch/examples/models/llama2/third-party/abseil-cpp'...
Cloning into '/home/weiguo/executorch/examples/third-party/LLaVA'...
Cloning into '/home/weiguo/executorch/kernels/optimized/third-party/eigen'...
Cloning into '/home/weiguo/executorch/third-party/flatbuffers'...
Cloning into '/home/weiguo/executorch/third-party/flatcc'...
Cloning into '/home/weiguo/executorch/third-party/gflags'...
Cloning into '/home/weiguo/executorch/third-party/googletest'...
Cloning into '/home/weiguo/executorch/third-party/ios-cmake'...
Cloning into '/home/weiguo/executorch/third-party/prelude'...
Cloning into '/home/weiguo/executorch/third-party/pybind11'...
Cloning into '/home/weiguo/executorch/backends/vulkan/third-party/VulkanMemoryAllocator'...
Submodule path 'backends/arm/third-party/ethos-u-core-driver': checked out '90f9df900acdc0718ecd2dfdc53780664758dec5'
Submodule path 'backends/arm/third-party/serialization_lib': checked out '187af0d41fe75d08d2a7ec84c1b4d24b9b641ed2'
Submodule path 'backends/vulkan/third-party/Vulkan-Headers': checked out '0c5928795a66e93f65e5e68a36d8daa79a209dc2'
Submodule path 'backends/vulkan/third-party/VulkanMemoryAllocator': checked out 'a6bfc237255a6bac1513f7c1ebde6d8aed6b5191'
Submodule path 'backends/vulkan/third-party/volk': checked out 'b3bc21e584f97400b6884cb2a541a56c6a5ddba3'
Submodule path 'backends/xnnpack/third-party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3'
Submodule path 'backends/xnnpack/third-party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1'
Submodule path 'backends/xnnpack/third-party/XNNPACK': checked out '20c0d886fb78d6497362e8303b999bf5d67aaa02'
Submodule path 'backends/xnnpack/third-party/cpuinfo': checked out 'd6860c477c99f1fce9e28eb206891af3c0e1a1d7'
Submodule path 'backends/xnnpack/third-party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8'
Submodule path 'examples/models/llama2/third-party/abseil-cpp': checked out '854193071498f330b71083d7e06a7cd18e02a4cc'
Submodule path 'examples/models/llama2/third-party/re2': checked out 'ac82d4f628a2045d89964ae11c48403d3b091af1'
Submodule path 'examples/third-party/LLaVA': checked out '7440ec9ee37b0374c6b5548818e89878e38f3353'
Submodule path 'examples/third-party/fbjni': checked out '52a14f0daa889a20d8984798b8d96eb03cebd334'
Submodule path 'kernels/optimized/third-party/eigen': checked out 'a39ade4ccf99df845ec85c580fbbb324f71952fa'
Submodule path 'third-party/flatbuffers': checked out '0100f6a5779831fa7a651e4b67ef389a8752bd9b'
Submodule path 'third-party/flatcc': checked out 'eb5228f76d395bffe31a33398ff73e60dfba5914'
Submodule path 'third-party/gflags': checked out 'a738fdf9338412f83ab3f26f31ac11ed3f3ec4bd'
Submodule path 'third-party/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929'
Submodule path 'third-party/ios-cmake': checked out '06465b27698424cf4a04a5ca4904d50a3c966c45'
Submodule path 'third-party/prelude': checked out '4e9e6d50b8b461564a7e351ff60b87fe59d7e53b'
Submodule path 'third-party/pybind11': checked out '8c7b8dd0ae74b36b7d42f77b0dd4096ebb7f4ab1'
Submodule path 'backends/vulkan/third-party/VulkanMemoryAllocator': checked out 'a6bfc237255a6bac1513f7c1ebde6d8aed6b5191'
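As a side note, submodule clone failures like the one above can often be worked around by routing git through a local proxy before retrying the sync. A minimal sketch (the proxy address below is just an example of a locally running proxy, not a recommendation):

```shell
# Example local proxy; replace with whatever proxy is reachable on your side.
export https_proxy="http://127.0.0.1:8119"
export http_proxy="http://127.0.0.1:8119"

# git also accepts an explicit proxy setting, which covers submodule clones.
git config --global http.proxy "http://127.0.0.1:8119"

# Then, from inside the executorch checkout, retry the failed submodules:
# git submodule update --init --recursive
```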
(executorch) weiguo:$ ./install_requirements.sh
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/nightly/cpu
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7f72a2601810>: Failed to establish a new connection: [Errno 101] Network is unreachable')': /simple/torch/
export https_proxy="http://127.0.0.1:8119"
(executorch) weiguo:$ ./install_requirements.sh
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/nightly/cpu
Collecting torch==2.4.0.dev20240507
Downloading https://download.pytorch.org/whl/nightly/cpu/torch-2.4.0.dev20240507%2Bcpu-cp310-cp310-linux_x86_64.whl (192.7 MB)
━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.5/192.7 MB 126.9 kB/s eta 0:24:28
Downloading https://download.pytorch.org/whl/nightly/cpu/torch-2.4.0.dev20240507%2Bcpu-cp310-cp310-linux_x86_64.whl (192.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/192.7 MB 75.1 kB/s eta 0:42:21
This dependency file is so big, and the network on my side so slow/unstable, that I have to skip this step.
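One possible workaround for large wheels on a flaky link (a sketch, using the wheel URL that pip printed above) is to fetch the file with a resumable downloader and then point pip at the local file:

```shell
# wget -c resumes a partial download instead of restarting from zero.
wget -c "https://download.pytorch.org/whl/nightly/cpu/torch-2.4.0.dev20240507%2Bcpu-cp310-cp310-linux_x86_64.whl"

# Install from the local wheel so pip skips the network fetch.
pip install ./torch-2.4.0.dev20240507+cpu-cp310-cp310-linux_x86_64.whl
```

Each rerun of `wget -c` continues from where the previous attempt broke off, which helps on an unstable connection.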
(executorch) weiguo:$ python -m examples.qualcomm.scripts.deeplab_v3 -b build_android -m SM8550 --compile_only --download
QNN_SDK_ROOT=/opt/qcom/aistack/qnn/2.20.0.240223
LD_LIBRARY_PATH=/opt/qcom/aistack/qnn/2.20.0.240223/lib/x86_64-linux-clang/:
Downloading http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar to ./deeplab_v3/voc_image/VOCtrainval_11-May-2012.tar
0%| | 1900544/1999639040 [00:11<3:51:06, 144066.45it/s] 0%| | 1900544/1999639040 [00:11<3:20:52, 165749.57it/s]
I'll check out the ExecuTorch Android demo app after I fix that issue on my side, and I'll update this thread if I make any positive progress.
Thanks again.
@zhouwg you need to set the HTP precision custom configuration to fp16 to get reasonable floating-point performance. Your case looks like an fp32 addition, and HTP doesn't officially support it.
And you can see this trace:
[qnn_sdk_logcallback, 874]: 0.0ms [WARNING] <W> Specified config SOC, ignoring on real target
QNN ignores the SOC configuration because the program runs on a real target; QNN is able to detect it.
Hmm, but I'm not sure the ExecuTorch repository is the correct place to ask QNN questions. You might want to check the Qualcomm QPM forum instead.
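For reference, setting the HTP precision to fp16 is typically done at graph-creation time through an HTP custom config. A rough C sketch against the QNN SDK headers (the handle names `qnn_interface` and `context` are placeholders for your own initialized objects, and the exact option names should be checked against your QNN SDK version):

```c
#include "QnnInterface.h"
#include "HTP/QnnHtpGraph.h"

// Request fp16 math on HTP so an fp32 graph runs at reasonable speed.
QnnHtpGraph_CustomConfig_t htp_precision_config = QNN_HTP_GRAPH_CUSTOM_CONFIG_INIT;
htp_precision_config.option    = QNN_HTP_GRAPH_CONFIG_OPTION_PRECISION;
htp_precision_config.precision = QNN_PRECISION_FLOAT16;

QnnGraph_Config_t graph_config;
graph_config.option       = QNN_GRAPH_CONFIG_OPTION_CUSTOM;
graph_config.customConfig = &htp_precision_config;

// NULL-terminated array of config pointers, passed at graph creation.
const QnnGraph_Config_t *graph_configs[] = {&graph_config, NULL};

Qnn_GraphHandle_t graph = NULL;
qnn_interface.graphCreate(context, "example_graph", graph_configs, &graph);
```

Without such a config, HTP may fall back to a slow emulation path for fp32 ops, which matches the poor performance described above.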
Thanks so much for helping me again.
I'm sorry about that, and I'll submit a ticket on Qualcomm's QPM forum next time.
Thanks. Or you can ping me in GGML pull requests (from the trace [ggml_qnn_add, 2229] I guess you're doing ggml things).
Thanks so much. I can't find that trace in upstream. Can we discuss this problem in my personal learning & study project: https://github.com/zhouwg/kantv/tree/ggml-qnn-quantize/core/ggml/llamacpp/tests/ggml-qnn ?
I have an Android phone equipped with a Qualcomm Snapdragon 8 Gen 3 and am trying to run the QNN SDK (/opt/qcom/aistack/qairt/2.23.0.240531) on it. I found two strange issues:
1. failed to specify the SoC config
Could any technical expert from QTI help explain why these happen? Thanks so much.