Pelochus / ezrknpu

Easy usage of Rockchip's NPUs found in RK3588 and similar chips
GNU General Public License v3.0

ezrknpu

Easy usage of Rockchip's NPU found in the RK3588 and similar chips, including ChatGPT-like LLMs and models such as YOLOv5. This repo is divided into two submodules:

Apart from that, you can find converted LLMs (and how to download them) here:

Currently focusing only on RK3588 and RK3588S.

Demo Videos

Check out these links to see some LLMs in action:

Tutorial from scratch

This XDA article shows how to use this repo from scratch to run LLMs on an Orange Pi 5 Pro:

https://www.xda-developers.com/how-i-used-the-npu-on-my-orange-pi-5-pro-to-run-llms/

The tutorial runs Phi-3 mini on the NPU of an Orange Pi 5 Pro with 16 GB of RAM.

Requirements

Keep in mind that this repo is focused on:

Recommended OS

I recommend using either Armbian (May 2024 builds or later) or Ubuntu Rockchip by Joshua Riek. Any OS that ships NPU driver 0.9.6 or later (usually builds from May 2024 onwards) should work:

Quick Install

This installs the required dependencies, packages and libraries, along with both RKNN Toolkit 2 and RKNN LLM.

Run:

curl https://raw.githubusercontent.com/Pelochus/ezrknpu/main/install.sh | sudo bash
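If piping a remote script straight into a root shell is a concern, the same installer can be saved to a file and read first. This is a minimal sketch, not the repo's documented procedure: the URL is the one from the command above, while the function name `fetch_installer` and the default output path are illustrative assumptions.

```shell
# Sketch: save the installer to a file so it can be inspected before running it as root.
# INSTALL_URL is taken from the quick-install command; everything else is illustrative.
INSTALL_URL="https://raw.githubusercontent.com/Pelochus/ezrknpu/main/install.sh"

# fetch_installer [DEST]: download the installer to DEST (default: ./install.sh).
fetch_installer() {
    # -f: fail on HTTP errors, -sS: quiet but still print errors, -L: follow redirects
    curl -fsSL "$INSTALL_URL" -o "${1:-install.sh}"
}

# Usage (nothing runs until you call it):
#   fetch_installer install.sh && less install.sh && sudo bash install.sh
```

Defining a function instead of running `curl` directly keeps the snippet safe to paste into a shell or source from another script.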

Custom install

You can download the install.sh script in this repo and run it with:

Running

Depending on what you want to run, please refer to these links:

Test Run

Currently, the lightest (works on 4GB of RAM) and best working model is Qwen 1.8B Chat. To run it:

GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/Pelochus/qwen-1_8B-rk3588 # Running git lfs pull after is usually better
cd qwen-1_8B-rk3588 && git lfs pull # Pull model
rkllm qwen-chat-1_8B.rkllm # Run!

The model takes about a minute to load. If something fails, keep reading below.
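Two easy-to-miss causes of a failed test run are a missing tool and a model file that is still a Git LFS pointer because `git lfs pull` did not complete. The sketch below checks for both. The helper names (`have_cmd`, `is_lfs_pointer`, `preflight`) are hypothetical and not part of this repo; the pointer check relies on the fact that LFS pointer files are small text files whose first line begins with `version https://git-lfs.github.com/spec/v1`.

```shell
# have_cmd NAME: succeed when NAME is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# is_lfs_pointer FILE: succeed when FILE is still a Git LFS pointer stub
# rather than the real model. Only the first 200 bytes are inspected.
is_lfs_pointer() {
    head -c 200 "$1" 2>/dev/null | head -n 1 |
        grep -q '^version https://git-lfs\.github\.com/spec/v1'
}

# preflight MODEL: print every problem found; return non-zero if there was any.
# The tool list mirrors the commands used in the test run (git, git lfs, rkllm).
preflight() {
    preflight_status=0
    for tool in git git-lfs rkllm; do
        have_cmd "$tool" || { echo "missing command: $tool"; preflight_status=1; }
    done
    if [ ! -f "$1" ]; then
        echo "model file not found: $1"; preflight_status=1
    elif is_lfs_pointer "$1"; then
        echo "$1 is an LFS pointer; run: git lfs pull"; preflight_status=1
    fi
    return "$preflight_status"
}

# Usage, from inside the cloned model directory:
#   preflight qwen-chat-1_8B.rkllm && rkllm qwen-chat-1_8B.rkllm
```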

Downloading and running an LLM on your NPU

For downloading:

For running an already downloaded model:

For converting a compatible model to RKLLM format (check Rockchip's docs in the rknn-llm repo first). This requires an x86 PC:

Running and using the RKNN-Toolkit for NN applications

https://github.com/Pelochus/ezrknn-toolkit2/?tab=readme-ov-file#test


Checking NPU info

NPU usage

You have 3 options for checking usage:
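One approach that is often used on Rockchip kernels is to read the driver's debugfs load file. The sketch below assumes that the file lives at `/sys/kernel/debug/rknpu/load` and reports one percentage per NPU core; both the path and the output format are assumptions that may differ on your kernel, so the parser only extracts whatever `NN%` values it finds. The function name is hypothetical.

```shell
# Assumed location of the load file; override RKNPU_LOAD_FILE if your kernel differs.
RKNPU_LOAD_FILE="${RKNPU_LOAD_FILE:-/sys/kernel/debug/rknpu/load}"

# npu_load_percentages: read load text on stdin and print each percentage
# value on its own line, without the trailing % sign.
npu_load_percentages() {
    grep -o '[0-9][0-9]*%' | tr -d '%'
}

# Usage (debugfs usually requires root):
#   sudo cat "$RKNPU_LOAD_FILE" | npu_load_percentages
```

Extracting only the percentages keeps the helper independent of the exact label text the driver prints.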

NPU driver

Run dmesg | grep -i rknpu and it should output something like:

[    7.648610] RKNPU fdab0000.npu: Adding to iommu group 0
[    7.648747] RKNPU fdab0000.npu: RKNPU: rknpu iommu is enabled, using iommu mode
[    7.648893] RKNPU fdab0000.npu: Looking up rknpu-supply from device tree
[    7.650056] RKNPU fdab0000.npu: Looking up mem-supply from device tree
[    7.652808] RKNPU fdab0000.npu: can't request region for resource [mem 0xfdab0000-0xfdabffff]
[    7.652838] RKNPU fdab0000.npu: can't request region for resource [mem 0xfdac0000-0xfdacffff]
[    7.652859] RKNPU fdab0000.npu: can't request region for resource [mem 0xfdad0000-0xfdadffff]
[    7.653197] [drm] Initialized rknpu 0.9.5 20240226 for fdab0000.npu on minor 1
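The last line of that output carries the driver version (0.9.5 in this sample), which can be compared against the 0.9.6 minimum named under Recommended OS. A minimal sketch, assuming the log line keeps the `Initialized rknpu X.Y.Z` shape shown above; the function names are hypothetical.

```shell
# rknpu_version_from_log: read kernel log lines on stdin and print the first
# X.Y.Z version that follows "Initialized rknpu".
rknpu_version_from_log() {
    sed -n 's/.*Initialized rknpu \([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' |
        head -n 1
}

# version_ge A B: succeed when dotted version A is greater than or equal to B.
# Fields are compared numerically, left to right; missing fields count as 0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Usage:
#   v=$(dmesg | grep -i rknpu | rknpu_version_from_log)
#   version_ge "$v" 0.9.6 && echo "driver $v is new enough" || echo "driver $v is too old"
```

The comparison is numeric per field, so a version such as 0.10.0 correctly ranks above 0.9.6, which a plain string comparison would get wrong.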

Useful Links

Contributing

Please open an issue or PR on the corresponding submodule repo. If unsure, open it on this repo:

The main contributions currently needed are:

Credits

To the r/RockchipNPU subreddit, which helped me tremendously with testing and discovering why things failed and how to solve those issues.