zhouwg / kantv

Workbench for learning and practising AI tech in real scenarios on Android devices, powered by GGML (Georgi Gerganov Machine Learning), NCNN (Tencent NCNN), and FFmpeg
Apache License 2.0

ggml-qnn: provide a dedicated Android command line program to verify GGML QNN backend on Qualcomm SoC based Android phone #225

Closed. zhouwg closed this 1 month ago

zhouwg commented 1 month ago

This PR provides a dedicated Android command-line program for unit testing (UT) the QNN backend. It is needed because there are unknown bugs in test-backend-ops.cpp, which is provided by the maintainer of the ggml backend subsystem. The program works very well and will be used to add quantized data support to the QNN backend in the future (there is a known bug in ggml-qnn.cpp; that is a separate topic, and I will fix it later). This work is not essential to me because I have already implemented many test cases for the QNN backend in the ggml-jni layer.

Unfortunately, the code for the dedicated command-line UT of the QNN backend in this PR cannot work as expected in the upstream llama.cpp/whisper.cpp/GGML community, because a PR it depends on (refine ggml backend subsystem for mixed inference between CPU&GPU / CPU&NPU easily) was not accepted by the maintainer of the ggml backend subsystem.

BTW, this PR also fixes a bug in ggml-jni that was caused by only FP32 being supported in ggml.c. With this fix, all test cases of ggml-qnn-backend in the ggml-jni layer should now be bug-free. Reports of issues in ggml-jni or ggml-qnn are welcome and greatly appreciated.