zhouwg / kantv

workbench for learning & practising AI tech in real scenarios on Android devices, powered by GGML (Georgi Gerganov Machine Learning), NCNN (Tencent NCNN), and FFmpeg
Apache License 2.0

How to use ggml_qnn.cpp and ggml_qnn.h in llama.cpp? #162

Closed QIANXUNZDL123 closed 3 months ago

QIANXUNZDL123 commented 3 months ago

It's a very good project. I want to use ggml_qnn.h in my llama.cpp for QNN acceleration. How should I add ggml_qnn.h to llama.cpp? Since I'm not an Android developer, I'd like to be able to compile an executable of llama.cpp that runs on the Qualcomm platform. Can you help me?

zhouwg commented 3 months ago

Please refer to the following commit in this PR for how to use ggml_qnn.h & ggml_qnn.cpp in a customized llama.cpp:

https://github.com/ggerganov/llama.cpp/pull/6869/files#diff-150dc86746a90bad4fc2c3334aeb9b5887b3adad3cc1459446717638605348ef

Or refer to this build script for how to compile an executable for a Qualcomm platform:

https://github.com/ggerganov/llama.cpp/pull/6869/files#diff-9de16aab6b8949914b9bde4da2a1e158ee7967a708eb3d6645b72297f35de0a2

By the way, ggml-qnn.cpp should also work as expected on Windows on ARM with Qualcomm's state-of-the-art desktop SoCs, but it is just a PoC currently.
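For readers following along, the build flow in the linked script can be sketched roughly as below. This is a hedged sketch, not the actual script from the PR: the NDK and QNN SDK paths, the `GGML_QNN` and `QNN_SDK_PATH` CMake option names, and the output binary name are all assumptions that may differ from the real build script, so check the link above for the authoritative version.

```shell
#!/bin/bash
# Hypothetical sketch of cross-compiling llama.cpp for a Qualcomm Android
# device with the Android NDK's CMake toolchain. Paths and the GGML_QNN /
# QNN_SDK_PATH options are illustrative assumptions, not the PR's real flags.
set -e

# Assumed install locations; override via environment variables.
ANDROID_NDK=${ANDROID_NDK:-/opt/android-ndk}
QNN_SDK_PATH=${QNN_SDK_PATH:-/opt/qcom/qnn-sdk}

# Configure an arm64 Android build with the (assumed) QNN backend enabled.
cmake -B build-android \
    -DCMAKE_TOOLCHAIN_FILE="${ANDROID_NDK}/build/cmake/android.toolchain.cmake" \
    -DANDROID_ABI=arm64-v8a \
    -DANDROID_PLATFORM=android-33 \
    -DGGML_QNN=ON \
    -DQNN_SDK_PATH="${QNN_SDK_PATH}"

cmake --build build-android --config Release -j"$(nproc)"

# Then push the resulting binary to the device and run it there, e.g.:
#   adb push build-android/bin/main /data/local/tmp/
#   adb shell /data/local/tmp/main -m /data/local/tmp/model.gguf -p "hello"
```

The key point is that the Android NDK toolchain file drives cross-compilation, while the QNN SDK only needs to be visible to CMake so the backend sources can find the Qualcomm headers and libraries.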

QIANXUNZDL123 commented 3 months ago


Thank you very much for your reply. I will try the method in the links, and I will continue to follow the project.

zhouwg commented 3 months ago


It's my pleasure.

zhouwg commented 3 months ago

Hello, I will close this open issue later if you do not require further assistance. Of course, the issue can be re-opened in the future as needed.

thanks.