Closed JiaoYanMoGu closed 5 months ago
A1: We did not use QNN; inference on the Snapdragon 888 currently runs on the CPU. We are still looking for a suitable deployment strategy on mobile devices. Any good ideas?

A2: We developed some ops with ggml.c, following llama.cpp, to fit the architecture of MobileVLM, check here.
With these instructions you can run MobileVLM with a customized llama.cpp
on Android devices. Please let us know if you have any further problems.
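As a rough sketch of the deployment path described above, llama.cpp can be cross-compiled for an arm64 Android device with the Android NDK's CMake toolchain file and then pushed to the phone with adb. The paths (`$ANDROID_NDK`, `/data/local/tmp`) and the API level are assumptions for illustration; consult the customized llama.cpp's own build notes for the exact targets MobileVLM needs.

```shell
# Cross-compile llama.cpp for 64-bit ARM Android (assumes $ANDROID_NDK
# points at an installed NDK; android-23 is an illustrative API level).
cmake -B build-android \
  -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-23
cmake --build build-android --config Release

# Copy the resulting binary and a quantized model to the device,
# then run inference there over an adb shell.
adb push build-android/bin/main /data/local/tmp/
adb push model-q4_0.gguf /data/local/tmp/
adb shell /data/local/tmp/main -m /data/local/tmp/model-q4_0.gguf -p "Hello"
```

Running on-device rather than through QNN keeps everything on the CPU, which matches the Snapdragon 888 setup mentioned above.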
Hi, we are closing this issue due to inactivity. We hope your question has been resolved. If you have any further concerns, please feel free to re-open it or open a new issue. Thanks!
As mentioned in other issues:
Deployment on mobile platforms such as the Snapdragon 888 is based on llama.cpp. The questions I want to ask are: