A workbench for learning and practising AI technology in real scenarios on Android devices, powered by GGML (Georgi Gerganov Machine Learning), NCNN (Tencent NCNN) and FFmpeg
Apache License 2.0
ggml-jni: troubleshooting MiniCPM-V inference on xiaomi14 #204
MiniCPM-V is a GPT-4V-level multimodal LLM provided by OpenBMB (https://github.com/OpenBMB/MiniCPM-V). OpenBMB provides official GGUF-format models, which are very helpful for programmers/developers:

- for users in China: https://modelscope.cn/models/OpenBMB/MiniCPM-Llama3-V-2_5-gguf/files
- for users outside of China: https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-gguf/tree/main
The MiniCPM-V command line application works fine on Ubuntu 20.04, but MiniCPM-V inference crashes on Xiaomi14. This is also an example of why some programmers from the ggml community should spend effort on Android: llama.cpp provides several command line applications (such as main), and they work fine on Linux/Windows/Mac, but that does not guarantee they work on Android.

This PR intends to troubleshoot and fix this issue in project KanTV.
Why this issue happened on Android
How to build and run the MiniCPM-V command line application on Linux:

```shell
git checkout master   # the master branch is preferred since 05-29-2024
cd ${PROJECT_ROOT_PATH}/core/ggml/llamacpp
make clean
make
./minicpmv-cli -m ggml-model-Q4_K_M.gguf --mmproj mmproj-model-f16.gguf --image test.jpeg -p "What is in the image?" -t 4
```
We observe that the command line application works fine on Linux.
How to build and run MiniCPM-V inference on Android:

```shell
git checkout minicpm-v
```

Then build the APK accordingly and run it on an Android phone. We observe that the APK crashes on Xiaomi14. The crash log can be captured with:

```shell
adb logcat | grep KANTV
```
NOTE: the source code of the command line application on Linux and of the Android APK is the same.
Root cause
The root cause is that C++ exceptions do not work by default when building with the Android NDK: exception support has to be enabled explicitly in the native build, otherwise a thrown exception terminates the process.

How to resolve

The method to fix this issue can be found on the master branch after this PR is merged.
Inference performance on Xiaomi14 (without QNN backend):
TODO
MiniCPM-V inference in the Android APK using the QNN backend will crash: the inference itself completes successfully, but the crash happens in the resource cleanup stage. This is a known issue (it also happened in LLM inference, where it was already fixed with a workaround/dirty method). I will try to fix it in the future.
NOTE: this issue also happened (before 05-28-2024) with the original MiniCPM-V code in https://github.com/OpenBMB, although the minicpm-v command line application works fine on Android in https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv. In fact, the minicpm-v JNI code in project KanTV was borrowed directly from https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv.
For background on why many C++ codebases are built without exceptions, see https://google.github.io/styleguide/cppguide.html#Exceptions

![Screenshot from 2024-05-28 09-16-57](https://github.com/zhouwg/kantv/assets/6889919/9f019fa9-a55f-491f-b372-a2005f25c7bb)
Inference performance on Xiaomi14 (without QNN backend):

![1449686473](https://github.com/zhouwg/kantv/assets/6889919/de54b356-c0ef-4080-b096-39ebe557527d)