How to use the GPU to run models, e.g. by setting "-ngl N, --n-gpu-layers N" as in the llama.cpp project?

kherud / java-llama.cpp

Java Bindings for llama.cpp - A Port of Facebook's LLaMA model in C/C++

MIT License

How to use the GPU to run models, e.g. by setting "-ngl N, --n-gpu-layers N" as in the llama.cpp project? #32

Closed: Void-nebula closed this issue 9 months ago

Void-nebula commented 9 months ago

Here's what I've tried in my project:

ModelParameters modelParams = new ModelParameters.Builder()
        .setNGpuLayers(nGpuLayers)
        .build();

'nGpuLayers' is an integer set to the same value I used in the llama.cpp project. However, the Task Manager shows that the GPU does not seem to be used at all while the model is running. May I ask why? Thank you!

kherud commented 9 months ago

Hi, you are probably using the pre-compiled llama.cpp library of this repository. We currently only provide support for CPU inference since there are too many ways to compile the library. For GPU support, you have to compile the library yourself. Please refer to https://github.com/kherud/java-llama.cpp#setup-required
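
Once you have compiled the library yourself, it can help to confirm that the JVM actually sees your build rather than silently falling back to the bundled CPU-only one. The following is only a minimal sketch in plain Java, not part of this library's API. The class `NativeLibCheck` and its method `findLibrary` are hypothetical helpers, and both the system property name `de.kherud.llama.lib.path` and the base name `llama` are assumptions that should be checked against the setup section of the README for the version you use.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

public class NativeLibCheck {

    // Hypothetical helper: returns the platform-specific library file
    // inside `dir` if it exists, otherwise an empty Optional.
    public static Optional<Path> findLibrary(Path dir, String baseName) {
        // System.mapLibraryName converts a base name such as "llama" into
        // the platform file name (e.g. libllama.so, llama.dll, libllama.dylib).
        Path candidate = dir.resolve(System.mapLibraryName(baseName));
        return Files.isRegularFile(candidate) ? Optional.of(candidate) : Optional.empty();
    }

    public static void main(String[] args) {
        // Assumption: the directory of the self-compiled library is passed
        // via this system property; verify the name in the README.
        String dir = System.getProperty("de.kherud.llama.lib.path", ".");

        // Assumption: the shared library's base name is "llama".
        Optional<Path> lib = findLibrary(Paths.get(dir), "llama");

        if (lib.isPresent()) {
            System.out.println("Found custom build: " + lib.get());
        } else {
            System.out.println("No custom build found in " + dir);
        }
    }
}
```

Run it with the same `-D` option you pass to your application. If no custom build is found, a GPU layer setting such as `setNGpuLayers` is unlikely to have any visible effect, which would match the behaviour described above.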

Void-nebula commented 9 months ago

Thank you so much!