Help,
I have deployed Ollama in an offline environment on Ubuntu 18.04.3. When running the llama3:8b model, I found that the GPU was not being used; only the CPU was being utilized. Upon checking the logs, I discovered this error message: `ggml_cuda_init: failed to initialize CUDA: initialization error`
I have not been able to fix this. What should I do?
PyTorch version: 1.4.0
CUDA version: 10.0
GPU: NVIDIA T4
Have you checked the minimum CUDA version and NVIDIA driver version required by the latest ggml? You can also check the ggml and llama.cpp repos for more help with this issue.
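As a starting point, a few diagnostic commands can narrow down whether the driver, the CUDA toolkit, or the kernel modules are at fault. This is only a sketch, assuming a standard NVIDIA driver install; the commands are guarded so the script runs even on a machine without a GPU.

```shell
# 1. Driver and GPU visibility: if this fails, the driver itself is the
#    most likely cause of the ggml_cuda_init initialization error.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi
else
    echo "nvidia-smi not found: NVIDIA driver is not installed or not on PATH"
fi

# 2. Local CUDA toolkit version, if one is installed. (Note: Ollama ships
#    its own CUDA runtime libraries, so a local toolkit is not required,
#    but a very old driver can still be incompatible with them.)
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
else
    echo "nvcc not found: no local CUDA toolkit on PATH"
fi

# 3. Kernel module state: an "initialization error" often means the nvidia
#    kernel modules are not loaded or are mismatched with the driver libraries.
lsmod | grep -i nvidia || echo "no nvidia kernel modules loaded"
```

If `nvidia-smi` itself fails or reports a driver/library version mismatch, updating (or reinstalling) the NVIDIA driver is usually the first thing to try before debugging Ollama itself.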
Screenshot of the error:
![2](https://github.com/meta-llama/llama3/assets/56375821/d99aae9a-49c9-4199-a741-de85257a3e71)