Closed xuandy05 closed 6 days ago
Hi @xuandy05, we will release a new variant of Llama 3 that is compatible with on-device execution. In the meantime, please follow https://github.com/quic/ai-hub-models/tree/main/qai_hub_models/models/llama_v2_7b_chat_quantized/gen_ondevice_llama to run Llama 2 on-device.
Llama 3 will use a similar workflow to run on-device. NOTE: you can use the current Llama 3 with the workflow above, but it will require changes to the config file. Please stay tuned; we will post an update once the Llama 3 flow is released.
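As a rough sketch of the linked workflow, the general shape is: export the quantized model through AI Hub Models, package it with the `gen_ondevice_llama` tooling, and push the result to the phone. The output paths below are hypothetical placeholders; the linked README is the authoritative reference.

```shell
# Sketch only -- exact flags and output paths may differ; see the
# gen_ondevice_llama README linked above for the authoritative steps.

# 1. Export the quantized Llama 2 model via Qualcomm AI Hub:
python -m qai_hub_models.models.llama_v2_7b_chat_quantized.export

# 2. Package the exported artifacts with the gen_ondevice_llama tooling
#    (per the linked README), then push the bundle to the device.
#    "llama_bundle" is a placeholder name for the generated output:
adb push llama_bundle /data/local/tmp/llama
```

The same overall flow is expected to apply to Llama 3 once its config changes are published.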
Hello, I am trying to run Llama-v3-8B-Chat on my Android phone using the NPU. After exporting the model to the optimized Qualcomm format, how can I run it locally on the mobile device with my own prompts? Thank you!