zhouwg / kantv

workbench for learning & practising AI tech in real scenarios on Android devices, powered by GGML (Georgi Gerganov Machine Learning), NCNN (Tencent NCNN), and FFmpeg
Apache License 2.0
96 stars, 17 forks

ggml-qnn: refine code and keep ggml-qnn.cpp & ggml-qnn.h in sync between local and the PR in upstream #215

Closed (zhouwg closed 1 month ago)

zhouwg commented 1 month ago

Validated on a Xiaomi 14 (an Android phone based on the Qualcomm Snapdragon 8 Gen 3 mobile SoC) with the following cases:

mulmat with the QNN CPU/GPU/NPU backends across different thread counts (1-8)

qnn-auto-ut (add, mulmat) with the QNN CPU/GPU/NPU backends across different thread counts (1-8). There is a bug in the automated test of the GGML mul OP.

whispercpp inference with the QNN CPU/GPU/NPU backends across different thread counts (1-8)

llamacpp inference with the QNN CPU/GPU/NPU backends across different thread counts (1-8)

All of the above test cases work as expected.
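
The coverage described above is a test case x backend x thread count matrix. As a rough sketch only (this is not the project's test harness; the labels are plain strings copied from this comment, and `build_matrix` is a hypothetical helper name), the combinations can be enumerated like this:

```python
from itertools import product

# Labels copied from the comment above. They are plain strings for
# illustration, not identifiers taken from ggml-qnn.cpp or ggml-qnn.h.
CASES = [
    "mulmat",
    "qnn-auto-ut (add)",
    "qnn-auto-ut (mulmat)",
    "whispercpp inference",
    "llamacpp inference",
]
BACKENDS = ["QNN CPU", "QNN GPU", "QNN NPU"]
THREADS = range(1, 9)  # thread counts 1 through 8, inclusive


def build_matrix():
    """Return every (case, backend, thread count) combination (hypothetical helper)."""
    return list(product(CASES, BACKENDS, THREADS))


matrix = build_matrix()
print(len(matrix))  # 5 cases * 3 backends * 8 thread counts
```

Listing the combinations explicitly makes it easy to see which one a failing run belongs to, assuming each combination is run independently.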