quic / ai-hub-models

The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.
https://aihub.qualcomm.com
BSD 3-Clause "New" or "Revised" License

[question] Can I deploy an LLM on SA8295 with Hexagon v68? #124

Open ecccccsgo opened 1 week ago

ecccccsgo commented 1 week ago

Hello, I'm trying to deploy an LLM on SA8295 using the NPU to accelerate inference. I have tried several times following the guidance at https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie, but it fails every time. I then found the note "Only supports Hexagon v73 and onward architectures", while the SA8295 supports v68 according to the doc at qairt/2.27.7.241014/docs/QNN/general/htp/htp_backend.html.
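
For reference, this is roughly how I was planning to double-check which Hexagon versions the AI Hub device targets expose. It is only a sketch: it assumes the qai-hub Python client is installed and configured with an API token, and the attribute labels ("hexagon:...", "chipset:...") are my assumption from the docs, not something I have verified for SA8295.

```python
# Sketch: list AI Hub devices and their chipset / Hexagon attributes.
# Assumes `pip install qai-hub` and a configured API token
# (`qai-hub configure --api_token <TOKEN>`). The exact attribute
# labels below are assumptions and may need adjusting.
import qai_hub as hub

for device in hub.get_devices():
    hexagon = [a for a in device.attributes if a.startswith("hexagon:")]
    chipset = [a for a in device.attributes if a.startswith("chipset:")]
    print(f"{device.name}: chipset={chipset} hexagon={hexagon}")
```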

I would like to know whether LLM inference on the NPU is simply not supported on SA8295. :(

Looking forward to your reply. Thank you.

ecccccsgo commented 1 week ago

I found that the newer chipset Snapdragon 8cx Gen 3 (SC8280X) also uses Hexagon v68, and it appears to support LLMs on the NPU according to https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/overview.html, although it is a PC chipset targeted with MSVC.