[Closed] userbox020 closed this issue 5 months ago
How do you run cuBLAS on an AMD GPU? All I can find is that ROCm supports CUDA, but in true AMD fashion there's no mention of how it actually works.
sup bro, I'm not an expert on CUDA, ROCm, or cuBLAS, I've just been digging in my free time these last few weeks to get my AMD cards working with this new AI stuff. It doesn't use cuBLAS: ROCm uses hipBLAS instead. I'm not sure if I installed an extra Ubuntu package, but hipBLAS somehow behaves like CUDA, because you can use CUDA-style environment variables to select or disable GPUs, and also test them with PyTorch. However, if I remember right, llama.cpp doesn't use Torch or PyTorch as a backend.
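To make the GPU-selection part concrete, here's a minimal sketch of what that looks like on a ROCm box. It assumes you have ROCm and a ROCm build of PyTorch installed; on those builds the torch.cuda API is mapped onto HIP, which is why the CUDA-style calls report AMD GPUs. HIP_VISIBLE_DEVICES is ROCm's native counterpart to CUDA_VISIBLE_DEVICES:

# sketch, assuming ROCm + a ROCm build of PyTorch are installed
# limit which AMD GPUs this shell's processes can see
export HIP_VISIBLE_DEVICES=0,1
# list the AMD GPUs the driver sees
rocm-smi
# on ROCm builds of PyTorch, torch.cuda.* reports the AMD GPUs via HIP
python3 -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"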
Ty very much
sup guys, I solved my problem by compiling like the following:
make -j16 LLAMA_HIPBLAS=1 LLAMA_HIP_UMA=1 AMDGPU_TARGETS=gfx1030
and setting the following environment variables:
export ROCM_PATH=/opt/rocm
export HCC_AMDGPU_TARGET=gfx1030
export HSA_OVERRIDE_GFX_VERSION=10.3.0
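A hedged note on why that override matters (my addition, not from the thread): the RX 6800/6900 report gfx1030, while the RX 6700 series reports gfx1031, and HSA_OVERRIDE_GFX_VERSION=10.3.0 makes the gfx1031 card run the gfx1030 binaries. You can check what targets your own cards report with:

# list the gfx target of each installed AMD GPU
rocminfo | grep -i gfx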
Now I'm able to run one RX 6900, two RX 6800s, and one RX 6700 all together in multi-GPU. Everything is working great now! I'm using an old mobo with PCIe x1 gen1, so models take a long time to load, but once loaded the inference is fast.
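For anyone following along, a hedged usage sketch of a multi-GPU run with the make-built main binary; the model path and split ratios below are placeholders, not measured values. -ngl offloads layers to the GPUs and --tensor-split divides them across the cards:

# illustrative only: model path and ratios are placeholders
./main -m ./models/your-model.gguf -ngl 99 --tensor-split 3,2,2,1 -p "Hello"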
btw, I'm using ROCm 5.6
It loads fine and does inference fine with just one GPU, but when I add a second GPU I get the following output from the console:
The following are my loading settings: