Closed ljy6j13 closed 2 months ago
STEP.0, 1, 2, and 3 can all be done on the host device. After that, you can cross-compile llama.cpp for your target device. See #12 for an example of how to deploy to Android.
The cross-compile process is currently undocumented and not yet fully tested. We will add support for it to run_pipeline.py soon.
I have added commits that cross-compile for Android. I'm not familiar with the S905D3, but you can use them as an example of how to cross-compile for another device.
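As a rough illustration of what such a cross-compile configure step can look like, here is a minimal sketch that assembles a CMake command using the Android NDK's toolchain file. It is not the T-MAC pipeline's actual implementation. The helper name `android_cmake_command`, the example paths, the `arm64-v8a` ABI, and the API level are all assumptions chosen for illustration, and any T-MAC-specific CMake options are deliberately left out because they are not documented in this thread.

```python
import os
import shlex


def android_cmake_command(ndk_root, source_dir, build_dir,
                          abi="arm64-v8a", api_level=23):
    """Assemble (but do not run) a CMake configure command for Android.

    Hypothetical helper: the ABI and API level defaults are assumptions,
    adjust them for your board and OS image.
    """
    # The NDK ships its CMake toolchain file at this relative path.
    toolchain = os.path.join(ndk_root, "build", "cmake",
                             "android.toolchain.cmake")
    return [
        "cmake",
        "-S", source_dir,                         # llama.cpp source tree on the host
        "-B", build_dir,                          # out-of-tree build directory
        f"-DCMAKE_TOOLCHAIN_FILE={toolchain}",    # switch CMake to cross-compiling
        f"-DANDROID_ABI={abi}",                   # target CPU ABI
        f"-DANDROID_PLATFORM=android-{api_level}",
        "-DCMAKE_BUILD_TYPE=Release",
    ]


if __name__ == "__main__":
    # Example paths are placeholders, not real locations.
    cmd = android_cmake_command("/opt/android-ndk", "llama.cpp", "build-android")
    print(shlex.join(cmd))
```

The command is built on the host, so only the resulting binaries need to be copied to the device; neither Python nor TVM has to be installed there.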
You can test whether it works on Android and open a new issue if you encounter any problems. I will close this issue for now.
I followed the README to install T-MAC on an S905D3 development board with 4 GB of RAM, but the process was lengthy and difficult. Installing TVM (which runs automatically as part of "pip install -e .") requires compiling a large number of source files and takes more than 2 hours. Some of the source files are complex enough that the compiler needed more RAM than my device has, so the compile processes were killed.
Is there a way to deploy llama.cpp powered by T-MAC on an embedded device (Linux/Android) without having to install large components such as Python 3 and TVM on that device?