BlinkDL / ChatRWKV

ChatRWKV is like ChatGPT but powered by the RWKV (100% RNN) language model, and it is open source.
Apache License 2.0

AMD Ubuntu 22.04 GPU - guide #15

bennmann opened this issue 1 year ago

bennmann commented 1 year ago

bitsandbytes-rocm is also very challenging to get up and running for 8-bit on regular transformers (in steps following the final steps of this guide).

It may be hardcoded for ROCm 5.3 at the time of this writing, which means this guide may be incompatible with bitsandbytes-rocm (the GitHub repo for that project is not an official AMD one, and I won't link it here for that reason; it's easy to find, though).

Also, a lot of these issues may be resolved by purchasing AMD machine-learning silicon (such as the MI210) instead of consumer cards, but where's the fun in that (also, ain't nobody got that kind of money).

catundercar commented 1 year ago

How about wsl2?

bennmann commented 1 year ago

WSL 2 does not support AMD ROCm yet at the time of this guide. Please use dual-boot methods, or consider switching entirely to Linux (if you can get Proton working for gaming, etc.).

I would be happy to learn otherwise whenever this changes.


changtimwu commented 1 year ago

How is the performance?

People have reported that it's around RTX 3070 level. I am still interested in the performance, especially when using RWKV.

bennmann commented 1 year ago

Performance is OK: 5 tokens per second of output on a 6900 XT with the largest RWKV 14B model. Nvidia can do better with their tensor cores, but the 30x0 cards have VRAM limitations (except the 3090).

There are reports of the 3090 getting more than 20 tokens per second with RWKV version 0.6.0 using specific settings.

However, the AMD 7x00 series should be better with its AI GEMM functionality; I expect 20 tokens per second on the 14B model with RWKV 0.6.0 whenever ROCm supports the 7900 series (best guess).

Untested on the AMD 7x00 gfx11 series.
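For reference, throughput here is just tokens generated divided by wall-clock time; a minimal sketch (the token counts and timings below are illustrative, not measurements from this thread):

```python
def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Generation throughput: tokens emitted over wall-clock seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / elapsed_s

# Illustrative: 256 tokens in ~51.2 s matches the ~5 tok/s seen on a 6900 XT,
# while 256 tokens in ~12.8 s matches the ~20 tok/s reported for a 3090.
print(tokens_per_second(256, 51.2))  # → 5.0
print(tokens_per_second(256, 12.8))  # → 20.0
```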


catundercar commented 1 year ago


Thanks

bennmann commented 1 year ago

Update August 2023:

I kind of dislike containers and usually prefer bare metal, but the method below should work (untested):

Install the AMD drivers (you may need to `chown _apt amd_driver.deb` to install), install vim or your text editor of choice, and enable the Universe repositories (basic OS setup):

```shell
sudo apt-get update
sudo apt-get upgrade
```

Go through the Docker website installation steps (there are more than a few, and they must be followed exactly).

Once you have Docker:

```shell
docker pull rocm/pytorch-nightly
sudo docker run -it --network=host --device=/dev/kfd --device=/dev/dri \
  --group-add=video --ipc=host --cap-add=SYS_PTRACE \
  --security-opt seccomp=unconfined rocm/pytorch-nightly
```

In the running image:

```shell
cd /home
export HSA_OVERRIDE_GFX_VERSION=10.3.0
```
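The override string is the gfx target's digits regrouped as major.minor.stepping: gfx1030, the RDNA2 target the 6900 XT reports, becomes 10.3.0. A hypothetical helper to illustrate the mapping (not part of ROCm itself):

```python
def gfx_to_hsa_override(gfx: str) -> str:
    """Map a ROCm gfx target (e.g. 'gfx1030') to an HSA_OVERRIDE_GFX_VERSION
    string. The last two characters are the minor and stepping digits (hex in
    some targets, e.g. gfx90a); everything before them is the major version."""
    digits = gfx.removeprefix("gfx")
    major, minor, step = digits[:-2], digits[-2], digits[-1]
    return f"{int(major)}.{int(minor, 16)}.{int(step, 16)}"

print(gfx_to_hsa_override("gfx1030"))  # → 10.3.0
print(gfx_to_hsa_override("gfx1100"))  # → 11.0.0
```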

Install bitsandbytes with ROCm support (optional):

```shell
git clone https://github.com/arlo-phoenix/bitsandbytes-rocm-5.6.git bitsandbytes
cd bitsandbytes
make hip ROCM_TARGET=gfx1030
pip install --upgrade pip
pip install .
```

Install ChatRWKV:

```shell
cd ..
pip install --upgrade git+https://github.com/BlinkDL/ChatRWKV
```

Download the model you want from one of the usual places.

```shell
# AMD fan curve was not aggressive enough for my cooling
echo "alias python3='rocm-smi --setfan 99%;python3'" >> ~/.bashrc
cd ChatRWKV/v2
vim chat.py   # edit in your model location and other parameters you care about
python3 chat.py
```
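Inside chat.py the edits usually amount to two assignments; a sketch, assuming the variable names still match ChatRWKV's v2/chat.py at the time of writing (the model path is hypothetical):

```python
# Sketch of the chat.py edits -- variable names assumed from ChatRWKV v2/chat.py,
# model path hypothetical. MODEL_NAME omits the .pth extension.
args.strategy = 'cuda fp16i8'  # quantized strategy; plain 'cuda fp16' needs more VRAM
args.MODEL_NAME = '/home/models/your-rwkv-14b-model'
```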