janhq / cortex

Drop-in, local AI alternative to the OpenAI stack. Multi-engine (llama.cpp, TensorRT-LLM, ONNX). Powers 👋 Jan
https://jan.ai/cortex

epic: support ROCm #323

Open hiento09 opened 5 months ago

hiento09 commented 5 months ago

Problem

We need a new nitro bin file that supports AMD GPUs on both Windows and Linux.

Success Criteria

Parent epic

tikikun commented 5 months ago

hi @hiento09, can we spin up an Azure instance with an AMD GPU and test?

tikikun commented 5 months ago

hi @hiento09, can you set up an instance with specs like the link below? Azure is not working.

https://gist.github.com/cgmb/6ae0d118bf357fc4576a7568b85e1c45

tikikun commented 5 months ago

Tutorial to compile nitro on the system above (details in the link: https://gist.github.com/cgmb/6ae0d118bf357fc4576a7568b85e1c45):

First step: install the ROCm components (see the gist for details):

(screenshot of the install commands)
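For reference, on Ubuntu 22.04 the install step roughly amounts to the following. This is a hedged sketch, not the gist's exact commands: the package name (`rocm-hip-sdk`) is an assumption, and the AMD apt repository must already be configured, so the command is printed rather than executed.

```shell
#!/usr/bin/env sh
# Hedged sketch, NOT the gist's exact commands: install the ROCm HIP SDK
# on Ubuntu 22.04. Assumes the AMD apt repository is already configured;
# the package name is an assumption, so the command is printed, not run.
INSTALL_CMD="sudo apt install -y rocm-hip-sdk"
echo "$INSTALL_CMD"
```

Follow the gist itself for the authoritative steps; it also covers the kernel driver and group permissions.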

After that, build nitro with the flags below:

CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ \
    cmake -H. -Bbuild -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1011 -DCMAKE_BUILD_TYPE=Release

and then:

make

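Putting the two steps together, a minimal build script might look like this. It is only a sketch: it assumes ROCm is installed under /opt/rocm, that it is run from the nitro source root, and it defaults to the gfx1011 target mentioned above (override GPU_TARGET for other cards).

```shell
#!/usr/bin/env sh
# Sketch: configure and build nitro with the ROCm/HIPBLAS backend.
# Assumptions: ROCm lives under /opt/rocm; run from the nitro source root.
set -e

GPU_TARGET="${GPU_TARGET:-gfx1011}"   # e.g. gfx1032 for a Radeon RX 6600 XT

CONFIGURE_FLAGS="-DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS=${GPU_TARGET} -DCMAKE_BUILD_TYPE=Release"

if [ -f CMakeLists.txt ] && [ -x /opt/rocm/llvm/bin/clang ]; then
  # Configure with ROCm's clang, then build.
  CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ \
    cmake -H. -Bbuild ${CONFIGURE_FLAGS}
  make -C build -j"$(nproc)"
else
  echo "Run from the nitro source root on a machine with ROCm installed."
  echo "Configure flags: ${CONFIGURE_FLAGS}"
fi
```

Note the flag is `-DAMDGPU_TARGETS=gfx1011` with no space after `=`; a space there makes cmake treat the target as a separate (invalid) argument.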
hiento09 commented 5 months ago

I am requesting quota for a g4ad EC2 instance on AWS; I will come back to this task when the quota is granted.

hiento09 commented 5 months ago

I was able to build nitro with the above flags successfully using this Docker image: https://hub.docker.com/r/rocm/dev-ubuntu-22.04, but I did not have an AMD GPU to test the resulting binary.
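To reproduce that container build, something like the following should work. A sketch only: it assumes the nitro sources are checked out in the current directory and mounts them into the rocm/dev-ubuntu-22.04 image; the docker command is printed rather than executed so it can be reviewed first.

```shell
#!/usr/bin/env sh
# Sketch: build nitro inside the rocm/dev-ubuntu-22.04 image.
# Assumes nitro sources are in the current directory; the docker command
# is printed (not run) so it can be reviewed and adjusted first.
BUILD_CMD='CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ cmake -H. -Bbuild -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1011 -DCMAKE_BUILD_TYPE=Release && make -C build -j$(nproc)'
DOCKER_CMD="docker run --rm -v \"\$PWD\":/src -w /src rocm/dev-ubuntu-22.04 sh -c \"$BUILD_CMD\""
echo "$DOCKER_CMD"
```

This cross-compiles for the requested gfx target without needing an AMD GPU on the build host, which matches the situation above: the binary builds fine, but testing it still requires real hardware.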

hiento09 commented 5 months ago

(screenshot) AWS rejected our quota request; we may need to consider purchasing an AMD GPU. @tikikun @dan-jan

hiepxanh commented 5 months ago

I have one, let me test tomorrow 🗡️ @hiento09

Do you have a file I can test with? Or what do I need to do to build it myself?

hiepxanh commented 5 months ago

@hiento09 can you provide the bin file? I cannot get the build to run on my Windows WSL 2 setup.

My target GPU is gfx1032 (Radeon RX 6600 XT), if you need it.

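To find the right value for -DAMDGPU_TARGETS on a given machine, rocminfo (which ships with ROCm) reports the gfx architecture. A small sketch, with the parsing split into a function so it also works on captured output:

```shell
#!/usr/bin/env sh
# Sketch: list the gfx targets (for -DAMDGPU_TARGETS) of the local GPUs.
# rocminfo ships with ROCm; without it, this just prints a hint.

# Extract unique gfx identifiers from rocminfo-style text on stdin.
gfx_targets() {
  grep -o 'gfx[0-9a-f]*' | sort -u
}

if command -v rocminfo >/dev/null 2>&1; then
  rocminfo | gfx_targets
else
  echo "rocminfo not found; run on a machine with ROCm installed"
fi
```

For the Radeon RX 6600 XT above this should report gfx1032. As a side note, RDNA2 cards that are not on ROCm's official support list are sometimes run with HSA_OVERRIDE_GFX_VERSION=10.3.0 so they use the gfx1030 code path; treat that as a workaround, not official support.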

tikikun commented 5 months ago

Renaming to ROCm support, since Vulkan is already supported.

tikikun commented 4 months ago

We still have the ROCm build left @hiento09 @hiro-v

louis-jan commented 4 months ago

Experimental feature: 0.4.7

hiro-v commented 4 months ago

@louis-jan No, it's not. This is AMD ROCm (the equivalent of NVIDIA CUDA), not Vulkan. I'm moving this one back to the Icebox for now.

Van-QA commented 4 months ago

Converting this to an epic, to close https://github.com/janhq/jan/issues/913

0xSage commented 2 weeks ago

@Van-QA can we queue this up after trtllm for Cam? 🙏

Van-QA commented 2 weeks ago

hi @0xSage, if you look at the sprint / status, you can see that it's already in the next sprint