ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Feature Request: NPU Support #9181

Open gformcreation opened 2 months ago

gformcreation commented 2 months ago

Feature Description

First, thank you for your incredible work on this project! To enhance its performance, especially on mobile devices and NPU-enabled PCs like those with Copilot+, I would love to see support for Neural Processing Units (NPUs).

Motivation

Integrating NPU support would significantly improve the speed and efficiency of AI tasks, giving users faster response generation and a smoother, more responsive experience.

Possible Implementation

No response

piDack commented 2 months ago

AMD’s NPU has an implementation in this repository, but its performance is poor. I’ve done some exploration, but I couldn’t even pass the unit tests for basic ops, so I believe that support for AMD’s NPU might take a long time unless AMD deems it worth the effort.

hgftrdw45ud67is8o89 commented 2 months ago

I volunteer to test Intel AI Boost (NPU) here.

hpvd commented 2 months ago

AMD's NPU is XDNA; for the current state, see https://github.com/ggerganov/llama.cpp/issues/1499

hpvd commented 2 months ago

For Intel's NPU, this may be the right issue: https://github.com/ggerganov/llama.cpp/issues/5079

hpvd commented 2 months ago

Qualcomm's NPU is Hexagon; see https://github.com/ggerganov/llama.cpp/issues/2687

FranzKafkaYu commented 2 months ago

Can we support more NPU-like devices, such as the MediaTek APU (AI Processing Unit)? Its development kit is named NeuroPilot SDK (link).

themoneyevo commented 1 month ago

Any update?