janhq / cortex

Drop-in, local AI alternative to the OpenAI stack. Multi-engine (llama.cpp, TensorRT-LLM, ONNX). Powers 👋 Jan
https://jan.ai/cortex

epic: support ROCm #323

Open hiento09 opened 5 months ago

hiento09 commented 5 months ago

Problem

We need a new nitro bin file that supports AMD GPUs on both Windows and Linux.

Success Criteria

Parent epic

tikikun commented 5 months ago

hi @hiento09, can we spin up an Azure instance with an AMD GPU and test?

tikikun commented 5 months ago

hi @hiento09, can you set up an instance with specs like the link below? Azure is not working.

https://gist.github.com/cgmb/6ae0d118bf357fc4576a7568b85e1c45

tikikun commented 5 months ago

Tutorial to compile nitro on the system above (details in the link: https://gist.github.com/cgmb/6ae0d118bf357fc4576a7568b85e1c45):

First step: install the ROCm components (see the gist for details):

(screenshot of the install commands)
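For reference, on Ubuntu 22.04 the install step roughly amounts to the following. This is a hedged sketch, not the gist's exact commands: the package name (`rocm-hip-sdk`) is an assumption, and the AMD apt repository must already be configured, so the command is printed rather than executed.

```shell
#!/usr/bin/env sh
# Hedged sketch, NOT the gist's exact commands: install the ROCm HIP SDK
# on Ubuntu 22.04. Assumes the AMD apt repository is already configured;
# the package name is an assumption, so the command is printed, not run.
INSTALL_CMD="sudo apt install -y rocm-hip-sdk"
echo "$INSTALL_CMD"
```

Follow the gist itself for the authoritative steps; it also covers the kernel driver and group permissions.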

After that, build nitro with the flags below:

CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ \
    cmake -H. -Bbuild -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1011 -DCMAKE_BUILD_TYPE=Release

and then:

make

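Putting the two steps together, a minimal build script might look like this. It is only a sketch: it assumes ROCm is installed under /opt/rocm, that it is run from the nitro source root, and it defaults to the gfx1011 target mentioned above (override GPU_TARGET for other cards).

```shell
#!/usr/bin/env sh
# Sketch: configure and build nitro with the ROCm/HIPBLAS backend.
# Assumptions: ROCm lives under /opt/rocm; run from the nitro source root.
set -e

GPU_TARGET="${GPU_TARGET:-gfx1011}"   # e.g. gfx1032 for a Radeon RX 6600 XT

CONFIGURE_FLAGS="-DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS=${GPU_TARGET} -DCMAKE_BUILD_TYPE=Release"

if [ -f CMakeLists.txt ] && [ -x /opt/rocm/llvm/bin/clang ]; then
  # Configure with ROCm's clang, then build.
  CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ \
    cmake -H. -Bbuild ${CONFIGURE_FLAGS}
  make -C build -j"$(nproc)"
else
  echo "Run from the nitro source root on a machine with ROCm installed."
  echo "Configure flags: ${CONFIGURE_FLAGS}"
fi
```

Note the flag is `-DAMDGPU_TARGETS=gfx1011` with no space after `=`; a space there makes cmake treat the target as a separate (invalid) argument.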
hiento09 commented 5 months ago

I am requesting quota for a g4ad EC2 instance on AWS; I will come back to this task when the quota is granted.

hiento09 commented 5 months ago

I was able to build nitro with the above flags successfully using this Docker image: https://hub.docker.com/r/rocm/dev-ubuntu-22.04, but I did not have an AMD GPU to test the resulting binary.
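To reproduce that container build, something like the following should work. A sketch only: it assumes the nitro sources are checked out in the current directory and mounts them into the rocm/dev-ubuntu-22.04 image; the docker command is printed rather than executed so it can be reviewed first.

```shell
#!/usr/bin/env sh
# Sketch: build nitro inside the rocm/dev-ubuntu-22.04 image.
# Assumes nitro sources are in the current directory; the docker command
# is printed (not run) so it can be reviewed and adjusted first.
BUILD_CMD='CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ cmake -H. -Bbuild -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1011 -DCMAKE_BUILD_TYPE=Release && make -C build -j$(nproc)'
DOCKER_CMD="docker run --rm -v \"\$PWD\":/src -w /src rocm/dev-ubuntu-22.04 sh -c \"$BUILD_CMD\""
echo "$DOCKER_CMD"
```

This cross-compiles for the requested gfx target without needing an AMD GPU on the build host, which matches the situation above: the binary builds fine, but testing it still requires real hardware.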

hiento09 commented 5 months ago

(screenshot) AWS rejected our quota request; we may need to consider purchasing an AMD GPU. @tikikun @dan-jan

hiepxanh commented 5 months ago

I have one, let me test tomorrow 🗡️ @hiento09

Do you have a file I can test with? Or what do I need to do to build it myself?

hiepxanh commented 5 months ago

@hiento09 can you provide the bin file? I cannot get the build to run on my Windows WSL 2 setup.

My target GPU is gfx1032 (Radeon RX 6600 XT), if you need it.

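To find the right value for -DAMDGPU_TARGETS on a given machine, rocminfo (which ships with ROCm) reports the gfx architecture. A small sketch, with the parsing split into a function so it also works on captured output:

```shell
#!/usr/bin/env sh
# Sketch: list the gfx targets (for -DAMDGPU_TARGETS) of the local GPUs.
# rocminfo ships with ROCm; without it, this just prints a hint.

# Extract unique gfx identifiers from rocminfo-style text on stdin.
gfx_targets() {
  grep -o 'gfx[0-9a-f]*' | sort -u
}

if command -v rocminfo >/dev/null 2>&1; then
  rocminfo | gfx_targets
else
  echo "rocminfo not found; run on a machine with ROCm installed"
fi
```

For the Radeon RX 6600 XT above this should report gfx1032. As a side note, RDNA2 cards that are not on ROCm's official support list are sometimes run with HSA_OVERRIDE_GFX_VERSION=10.3.0 so they use the gfx1030 code path; treat that as a workaround, not official support.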

tikikun commented 5 months ago

Renaming to ROCm support, since Vulkan is already supported.

tikikun commented 4 months ago

We still have the ROCm build left @hiento09 @hiro-v

louis-jan commented 4 months ago

Experimental feature: 0.4.7

hiro-v commented 4 months ago

@louis-jan No, it's not. This is AMD ROCm (the equivalent of NVIDIA CUDA), not Vulkan. I'm moving this one back to the Icebox for now.

Van-QA commented 4 months ago

Converting this to an epic, to close https://github.com/janhq/jan/issues/913

0xSage commented 2 weeks ago

@Van-QA can we queue this up after trtllm for Cam? 🙏

Van-QA commented 2 weeks ago

hi @0xSage, if you look at the sprint / status, you can see that it's already in the next sprint