Thanks for reporting this. Honestly, I have never worked with LoRA adapters, but I will look into this now. Is the file you are using publicly available somewhere? Also, is it currently working correctly with the llama.cpp repository itself, since the file format changed from ggml to gguf?
The problem with lora_adapter is that it is not empty according to the .empty() method on the C++ side (it seems to contain a null value instead), even when I load the quantized gguf model onto the GPU without passing this parameter, in which case lora_adapter shouldn't be used at all. The parameter is used in llama.cpp/common/common.cpp.
It seems they recently changed the code, but up until a couple of minutes ago it was: https://github.com/ggerganov/llama.cpp/blob/a5661d7e71d15b8dfc81bc0510ba912ebe85dfa3/common/common.cpp#L765C1-L776C6
```cpp
if (!params.lora_adapter.empty()) {
    int err = llama_model_apply_lora_from_file(model,
                                               params.lora_adapter.c_str(),
                                               params.lora_base.empty() ? NULL : params.lora_base.c_str(),
                                               params.n_threads);
    if (err != 0) {
        fprintf(stderr, "%s: error: failed to apply lora adapter\n", __func__);
        llama_free(lctx);
        llama_free_model(model);
        return std::make_tuple(nullptr, nullptr);
    }
}
```
The condition check !params.lora_adapter.empty() was true even when the parameter was not passed. So it seems the problem isn't the lora_adapter itself but the fact that we end up with a null there instead of an empty string. Maybe setting it to "" would solve the issue; I will try that tomorrow morning.
On second thought, it is not being passed through correctly, since it prints null even when it is configured.
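To illustrate the difference (this is just a minimal standalone sketch, not the actual binding code): the '(null)' in the log is how glibc's printf renders a NULL char*, and once that literal text ends up inside params.lora_adapter, the string is no longer empty, so the LoRA branch above runs even though nothing was passed.

```cpp
#include <cstdio>
#include <string>

int main() {
    // Sketch only: what seems to end up in params.lora_adapter when no adapter is passed.
    std::string lora_adapter = "(null)";
    std::printf("empty: %s\n", lora_adapter.empty() ? "true" : "false"); // false -> adapter branch runs

    // What the value needs to be so that common.cpp skips the adapter code path.
    std::string defaulted = "";
    std::printf("empty: %s\n", defaulted.empty() ? "true" : "false");    // true  -> branch skipped
    return 0;
}
```

So explicitly defaulting the value to "" on the native side when nothing is passed from Java should make the existing .empty() check behave as intended.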
I have a similar issue: I am passing NO lora adapters in my parameters and get an error message:
```
....................................................................................................
llama_new_context_with_model: kv self size = 800.00 MB
llama_new_context_with_model: compute buffer total size = 75.47 MB
llama_new_context_with_model: VRAM scratch buffer: 74.00 MB
llama_apply_lora_from_file_internal: applying lora adapter from '(null)' - please wait ...
llama_apply_lora_from_file_internal: failed to open '(null)'
llama_init_from_gpt_params: error: failed to apply lora adapter
unable to load model
Exception in thread "main" de.kherud.llama.LlamaException: could not load model from given file path
    at de.kherud.llama.LlamaModel.loadModel(Native Method)
    at de.kherud.llama.LlamaModel.<init>(LlamaModel.java:54)
```
I just released version 3.0 of the library and this problem should hopefully no longer occur. Feel free to re-open this issue if you still experience problems.
Hi,
I compiled the library with CUDA support on Linux. There is an issue with passing the loraAdapter parameter.
My model parameters look like this:
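Something along these lines (a simplified sketch rather than my exact code, and the setter names are only placeholders for the pre-3.0 ModelParameters API):

```java
import de.kherud.llama.LlamaModel;
import de.kherud.llama.ModelParameters;

public class LoraExample {
    public static void main(String[] args) {
        // Simplified reconstruction; setter names are assumptions, not necessarily the exact API.
        ModelParameters params = new ModelParameters()
                .setNGpuLayers(43)                             // example value: offload layers to the GPU
                .setLoraAdapter("/path/to/lora-adapter.bin");  // the value that later shows up as null
        LlamaModel model = new LlamaModel("/path/to/model-q4_0.gguf", params);
        model.close();
    }
}
```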
But in the logs, there is a null value:
Strangely, the same issue occurs when I'm not passing this parameter.