ch1y0q opened this issue 6 days ago
What happened?
I am using llama.cpp + SYCL to perform inference on a multi-GPU server. The same model produces inference output correctly in single-GPU mode, but with multiple GPUs I get a segmentation fault.
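For context, a minimal sketch of the two invocations, assuming a SYCL build of llama.cpp's llama-cli; the model path, prompt, and GPU-layer count are placeholders, and -sm / -mg are llama.cpp's standard split-mode and main-GPU flags:

# single-GPU mode (works): pin everything to one device
ZES_ENABLE_SYSMAN=1 ./build/bin/llama-cli -m model.gguf -p "Hello" -ngl 33 -sm none -mg 0

# multi-GPU mode (segfaults): split layers across all SYCL devices
ZES_ENABLE_SYSMAN=1 ./build/bin/llama-cli -m model.gguf -p "Hello" -ngl 33 -sm layer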
Output of ./build/bin/llama-ls-sycl-device:

Name and Version
What operating system are you seeing the problem on?
Linux
Relevant log output

@ch1y0q This PR https://github.com/ggerganov/llama.cpp/pull/8014 fixes the issue, but it has not been approved yet. Until it is merged, you could build from an older commit, https://github.com/ggerganov/llama.cpp/commit/fb76ec31a9914b7761c1727303ab30380fd4f05c, or merge the PR into your own checkout of ggerganov/llama.cpp.
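A sketch of both options, assuming a local clone with ggerganov/llama.cpp as origin; pull/8014/head is GitHub's read-only ref for the PR branch, and pr-8014 is just a local branch name chosen here:

# option 1: build from the older commit referenced above
git checkout fb76ec31a9914b7761c1727303ab30380fd4f05c

# option 2: fetch the unmerged PR and merge it locally
git fetch origin pull/8014/head:pr-8014
git merge pr-8014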