SciSharp / LLamaSharp

A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
https://scisharp.github.io/LLamaSharp
MIT License

Unable to use lora in llamasharp but can use it in llama.cpp #618

Open pugafran opened 7 months ago

pugafran commented 7 months ago

I generated my own LoRA adapters using the finetune executable from the llama.cpp repository. When I tried to use them in llama.cpp, the .bin file works, but the .gguf returns "bad file magic". The thing is that in LLamaSharp the .gguf gives me the same "bad file magic" error, and if I try to load the .bin instead, it fails with a protected memory access error.

I used codellama-7b.Q8_0.gguf and codellama-7b-instruct.Q4_K_S.gguf as the base models to generate those adapters. I would very much like to be able to use the LoRA adapters.

I didn't find documentation on how to set this up, so I did some freestyle decompiling in Visual Studio, but I want to believe I'm doing it right:

AdapterCollection adapters = new AdapterCollection();
adapters.Add(new LoraAdapter("..\\..\\..\\..\\..\\CoPilot\\data\\lora.bin", 1.0f));

var parameters = new ModelParams(modelPath)
{
    LoraAdapters = adapters,
    ContextSize = 2048,
    Seed = 1337,
    GpuLayerCount = 15,
    EmbeddingMode = true,
    Threads = (uint)(Environment.ProcessorCount * 0.7)
};
pugafran commented 7 months ago

Perhaps it is the same as in https://github.com/SciSharp/LLamaSharp/issues/566? The error is the same as when I load the .bin, but the models I use are from codellama, so I don't know.

martindevans commented 7 months ago

.gguf returns "bad file magic"

The "file magic" is a very simple sanity check that the file is the right format, it just checks that the first 4 bytes are the file are the expected "magic number". If you're getting this error it probably means your gguf files are malformed.

if I try to load the bin it gives an error of protected access in memory.

I'm not sure about this. Generally a .bin extension indicates that you're using the wrong file type. The protected access violation is a pretty generic error, but llama.cpp often throws it if you pass in bad arguments and it doesn't notice (e.g. the file magic is correct, but the rest of the file is nonsense).

Code sample

That looks reasonable to me, just a couple of small things (that probably aren't relevant to your issue):

EmbeddingMode = true: I'm not sure if you need this on. You need it if you're going to be generating embeddings with this model, but I think codellama is for generation, not embedding?
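For context on what that flag does: when it's enabled, the context is set up for producing embeddings rather than generating tokens. A hedged sketch of that usage (API shape roughly as in LLamaSharp 0.10 — constructor signatures vary between versions, and the model path and question text are placeholders):

```csharp
using LLama;
using LLama.Common;

// Placeholder model path; EmbeddingMode = true is only needed when the
// model will be used to produce embeddings (e.g. for similarity search).
var parameters = new ModelParams("codellama-7b.Q8_0.gguf")
{
    EmbeddingMode = true,
};

using var weights = LLamaWeights.LoadFromFile(parameters);
var embedder = new LLamaEmbedder(weights, parameters);

// Embed a user question; the resulting vector can then be compared
// (e.g. by cosine similarity) against embeddings of stored documents.
float[] embedding = await embedder.GetEmbeddings("How do I sort a list in C#?");
```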

This is valid syntax that lets you initialize the adapters collection inline. It's the same as what you wrote, just a bit more compact:

var example = new ModelParams("whatever.gguf")
{
    LoraAdapters = {
        new LoraAdapter("example.gguf", 1.0f),
    }
};
pugafran commented 7 months ago

I changed it to the code you suggested, but it still doesn't work as expected. And yes, I generate embeddings of the user questions to compare them.

I don't know why the .bin works in llama.cpp but not in LLamaSharp. The problem comes from this function: [screenshot omitted]


SignalRT commented 7 months ago

@pugafran The problem seems to be in LLamaSharp, but I don't yet understand the reason.

llama.cpp

  1. I fine-tune the LLamaSharp example model, llama-2-7b-chat.Q4_0.gguf, with an example dataset.
  2. I test the LoRA with llama.cpp. It loads without problems.

In the case of LLamaSharp, it crashes at the moment it applies the LoRA. I don't yet have an explanation.

While I search for the reason, you could use the export-lora tool from llama.cpp to build a merged model from the base model + LoRA.
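For reference, the merge step might look like this (a sketch: the flag names below follow the llama.cpp export-lora example of that era, but the tool's options have changed between commits, so check `./export-lora --help` on your build; all file paths are placeholders):

```shell
# Bake the LoRA into the base model, producing a standalone gguf
# that can be loaded without any LoraAdapters at all.
./export-lora \
    --model-base llama-2-7b-chat.Q4_0.gguf \
    --model-out  llama-2-7b-chat.lora-merged.gguf \
    --lora       lora-adapter.bin
```

The merged model then loads through the normal ModelParams path, which sidesteps the adapter-loading code entirely.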

martindevans commented 7 months ago

Are you using the same version of llama.cpp as the binaries in LLamaSharp? It's unlikely but possible there's an incompatibility in the file format.

pugafran commented 7 months ago

Yes, I'm even using the exact commit for LLamaSharp 0.10:

[screenshots of the matching commit hashes omitted]

blueskyscorpio commented 7 months ago

I met the same problem when I load a .bin in LLamaSharp. [WeChat screenshots omitted]

SignalRT commented 7 months ago

@blueskyscorpio, your problem seems to be different: you are trying to load a .bin file, not a .gguf file, which is the supported format.

You need to load a supported model in gguf format (see the Supported Models list at https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#description).