Workbench for learning and practising AI tech in real scenarios on Android devices, powered by GGML (Georgi Gerganov Machine Learning), NCNN (Tencent NCNN), and FFmpeg
Apache License 2.0
ggml-qnn: refine code and keep ggml-qnn.cpp and ggml-qnn.h in sync between the local tree and the upstream PR #214
Validated on a Xiaomi 14 (an Android phone based on the Qualcomm Snapdragon 8 Gen 3 mobile SoC) with the following cases:
mulmat with the QNN CPU/GPU/NPU backends across different thread counts (1-8)
qnn-auto-ut (add, mulmat) with the QNN CPU/GPU/NPU backends across different thread counts (1-8). There is a minor bug in the automated test of the GGML mul op; I'll fix it in the next commit.
whispercpp inference with the QNN CPU/GPU/NPU backends across different thread counts (1-8)
llamacpp inference with the QNN CPU/GPU/NPU backends across different thread counts (1-8)
All of the above test cases work as expected.
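For reference, the validation matrix described above (four cases, three QNN backends, thread counts 1-8) can be enumerated with a short sketch. This is only an illustration: `TestRun` and `build_matrix` are hypothetical names, not part of ggml-qnn.cpp or ggml-qnn.h, and the backend strings are plain labels rather than identifiers from the QNN SDK.

```cpp
#include <array>
#include <string>
#include <vector>

// Hypothetical record for one validation run; not a type from ggml-qnn.
struct TestRun {
    std::string test_case;  // "mulmat", "qnn-auto-ut", "whispercpp" or "llamacpp"
    std::string backend;    // label only: "QNN CPU", "QNN GPU" or "QNN NPU"
    int         n_threads;  // thread count, 1 through 8
};

// Enumerate every (case, backend, thread count) combination listed in the PR text.
std::vector<TestRun> build_matrix() {
    const std::array<const char *, 4> cases    = {"mulmat", "qnn-auto-ut", "whispercpp", "llamacpp"};
    const std::array<const char *, 3> backends = {"QNN CPU", "QNN GPU", "QNN NPU"};

    std::vector<TestRun> runs;
    for (const char * c : cases) {
        for (const char * b : backends) {
            for (int t = 1; t <= 8; ++t) {
                runs.push_back({c, b, t});
            }
        }
    }
    return runs;
}
```

Calling `build_matrix()` yields one entry per combination, which makes it easy to check that no backend or thread count was skipped when repeating the validation on another device.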
This PR was reverted; the work has moved to https://github.com/zhouwg/kantv/pull/215