Open pugafran opened 7 months ago
I generated my own lora adapters using the finetune executable from the llama.cpp repository. When I tried to use them in llama.cpp using .bin it works, but the .gguf returns "bad file magic". The thing is that in llamasharp, the .gguf tells me the same thing, "bad file magic", but if I try to load the bin it gives an error of protected access in memory.
I used codellama-7b.Q8_0.gguf and codellama-7b-instruct.Q4_K_S.gguf as models to generate those adapters, and I would very much like to be able to use the lora adapters.
I didn't see documentation on how to implement it, so I did some freestyle decompiling in Visual Studio, but I want to believe I'm doing it right:
Perhaps it is the same as in https://github.com/SciSharp/LLamaSharp/issues/566? The error is the same as when I load the .bin, but the models I use are from codellama, so I don't know.
.gguf returns "bad file magic"
The "file magic" is a very simple sanity check that the file is the right format, it just checks that the first 4 bytes are the file are the expected "magic number". If you're getting this error it probably means your gguf files are malformed.
if I try to load the bin it gives an error of protected access in memory.
I'm not sure about this, generally .bin indicates that you're using the wrong file type. The protected access violation is a pretty generic error, but llama.cpp often throws this if you pass in bad arguments and it doesn't notice (e.g. the file magic is correct, but the rest of the file is nonsense).
Code sample
That looks reasonable to me, just a couple of small things (that probably aren't relevant to your issue):
EmbeddingMode: true
I'm not sure if you need this on. You need it on if you're going to be embedding things using this model, but I think codellama is for generation not embedding?
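For context, EmbeddingMode comes into play with something like this (a sketch against the LLamaSharp 0.10-era API from memory; LLamaEmbedder and GetEmbeddings may be named or shaped differently in other versions):

    using LLama;
    using LLama.Common;

    // Sketch, not verified against your exact version: EmbeddingMode enables
    // embedding output for contexts created from these parameters.
    var parameters = new ModelParams("codellama-7b.Q8_0.gguf")
    {
        EmbeddingMode = true
    };
    using var weights = LLamaWeights.LoadFromFile(parameters);
    var embedder = new LLamaEmbedder(weights, parameters);
    float[] embedding = embedder.GetEmbeddings("a user question to embed");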
This is valid syntax that allows you to init the adapters collection inline. Same thing as what you wrote, just a bit more compact:
var example = new ModelParams("whatever.gguf")
{
    LoraAdapters =
    {
        new LoraAdapter("example.gguf", 1.0f),
    }
};
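(Collection initializers compile down to Add calls, so the above is exactly equivalent to:)

    var example = new ModelParams("whatever.gguf");
    example.LoraAdapters.Add(new LoraAdapter("example.gguf", 1.0f));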
I changed it to the code you told me, but it still doesn't work as expected. Yes, I generate embeddings of user questions to compare.
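(As an aside, the usual way to compare question embeddings is cosine similarity; a hypothetical helper, not from this thread:)

    // Hypothetical helper: cosine similarity between two embeddings of equal length.
    static float CosineSimilarity(float[] a, float[] b)
    {
        float dot = 0f, normA = 0f, normB = 0f;
        for (var i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
    }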
I don't know why the .bin works in llama.cpp but not in LLamaSharp; the problem comes from this function:
@pugafran The problem seems to be in LLamaSharp, but I don't yet understand the reason.
In llama.cpp the adapter applies fine; in LLamaSharp it crashes at the moment the LoRA is applied. I don't have an explanation yet.
While searching for a reason, you could use the export-lora tool from llama.cpp to build a merged model from the base model + LoRA.
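Something along these lines (flag names as I remember them from llama.cpp's export-lora at the time; the adapter and output file names are placeholders, so run ./export-lora --help to confirm):

    ./export-lora \
        --model-base codellama-7b.Q8_0.gguf \
        --lora your-lora-adapter.bin \
        --model-out codellama-7b-merged.gguf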
Are you using the same version of llama.cpp as the binaries in LLamaSharp? It's unlikely but possible there's an incompatibility in the file format.
Yes, I'm even using the exact commit for LLamaSharp 0.10:
I met the same problem when I loaded a .bin in LLamaSharp.
@blueskyscorpio Your problem seems to be different: you are trying to load a .bin file, not a .gguf file, which is the supported format.
You need to load a supported model in gguf format (see the Supported Models list at https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#description).
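That is, roughly (a sketch; the path is a placeholder):

    using LLama;
    using LLama.Common;

    // Must point at a gguf file, not a .bin.
    var parameters = new ModelParams("some-model.Q4_K_M.gguf");
    using var weights = LLamaWeights.LoadFromFile(parameters);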