mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[Feature Request] NPU Support #1361

Open hchasens opened 11 months ago

hchasens commented 11 months ago

🚀 Feature

NPU support, particularly on smartphones featuring Google Tensor chips (Google Pixel) and newer Snapdragon chips, as well as the newer iPhone chips.

Motivation

More devices than ever are coming out with NPUs. The Pixel lineup has featured one for a while, Snapdragon's more recent chips have them, and so do Apple's chips (both in the iPhone and the Mac). They're becoming more prevalent and look to be here to stay, forming a trifecta of CPU, GPU, and NPU. I suspect that on many platforms NPUs will run LLMs faster and more efficiently than their GPU counterparts. I think this will be an important step toward making local LLMs on mobile mainstream. The hardware is there; we just need the software to use it.

I'm sure you're working on it already but I thought I'd post an official feature request for it.

Leokratis commented 7 months ago

Aren't the NPUs closed? I think I read somewhere that they're not open to developers yet (at least the Snapdragon chips).

Of course, I could be mistaken.

hchasens commented 7 months ago

Android's NNAPI can use NPUs/TPUs. For example, TensorFlow Lite lets you select delegates, one of which is the NNAPI delegate that can dispatch to the NPU.
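For reference, a minimal sketch of what selecting the NNAPI delegate looks like with the TensorFlow Lite Android Java API (the model file name is a placeholder; actual hardware dispatch depends on the device's NNAPI drivers):

```java
import java.nio.MappedByteBuffer;

import org.tensorflow.lite.Interpreter;
import org.tensorflow.lite.nnapi.NnApiDelegate;

public class NnapiExample {
    // Hypothetical helper returning a memory-mapped .tflite model buffer.
    static MappedByteBuffer loadModel() { /* ... */ return null; }

    public static void main(String[] args) {
        // Create the NNAPI delegate; NNAPI decides whether ops run on
        // the NPU/DSP/GPU or fall back to the CPU.
        NnApiDelegate nnApiDelegate = new NnApiDelegate();

        Interpreter.Options options = new Interpreter.Options()
                .addDelegate(nnApiDelegate);

        try (Interpreter interpreter = new Interpreter(loadModel(), options)) {
            // interpreter.run(input, output);
        } finally {
            // The delegate must be closed explicitly when no longer needed.
            nnApiDelegate.close();
        }
    }
}
```

Note that NNAPI only accelerates ops its drivers support, so a given LLM graph may still partially fall back to the CPU; delegate coverage varies a lot across vendors.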