leafspark / AutoGGUF

automatically quant GGUF models
Apache License 2.0

Can't merge lora and base gguf. [BUG] #7

Open JohnClaw opened 3 months ago

JohnClaw commented 3 months ago

I tried to merge adapter_model.safetensors and unsloth.Q8_0.gguf using your tool. Both were taken from here: https://huggingface.co/klei1/bleta-8b. Got this error:

[error screenshot] I'm a total noob and possibly was doing something stupid. Help me, please.

leafspark commented 3 months ago

Is this using the Export LoRA tool? You need to select the folder where the adapter is in the LoRA Conversion section, and then specify the output path (use GGML type). The resulting .bin adapter model can be merged with the GGUF in the Export LoRA section.
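For context, here is a rough sketch of the equivalent llama.cpp command-line steps this workflow corresponds to. The script and flag names vary across llama.cpp versions, and the paths are placeholders rather than the exact commands AutoGGUF runs:

```bash
# Illustrative only: adjust paths; script/flag names differ between llama.cpp releases.
# 1) Convert the PEFT adapter (adapter_config.json + adapter_model.safetensors)
#    into a GGML .bin adapter.
python convert_lora_to_ggml.py /path/to/bleta-8b-adapter

# 2) Merge the converted .bin adapter into the base GGUF with the export-lora tool.
./llama-export-lora \
  --model-base unsloth.Q8_0.gguf \
  --lora /path/to/bleta-8b-adapter/ggml-adapter-model.bin \
  --model-out bleta-8b-merged.Q8_0.gguf
```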

JohnClaw commented 3 months ago

Thanks for the answer. I followed your instructions: [settings screenshot] But got the same error when I clicked the "Convert LoRA" button:

[error screenshot]

leafspark commented 3 months ago

Hmm, can you paste the logs here? They should be available if you double-click on the task.

JohnClaw commented 3 months ago

I found two log files in the log subfolder. Here they are: latest_20240808_021741.log and lora_conversion_20240808_021829.log

leafspark commented 3 months ago

Seems like PyYAML is missing; you can install it and any other required dependencies with this command: pip install PyQt6 psutil requests "numpy<2.0.0" torch sentencepiece PyYAML (shutil is part of the Python standard library and doesn't need installing, and numpy<2.0.0 has to be quoted so the shell doesn't treat < as a redirect).

JohnClaw commented 3 months ago

Got a new log message: ERROR:lora-to-gguf:Error: param modules_to_save is not supported

leafspark commented 3 months ago

That can happen if some layers were trained fully rather than with LoRA; Unsloth may do that. In that case you need to contact the repo owners and ask either for the base model (it's likely meta-llama/Meta-Llama-3.1-8B-Instruct) or for merged safetensors (there's an HF to GGUF section that can easily convert those). Download the base model from HF and use the GGUF conversion mode with the model folder supplied to the Base Model box; that path is newer and officially supported, although it requires a base model.
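For reference, a minimal sketch of what that newer path looks like with llama.cpp's convert_lora_to_gguf.py (presumably what the GGUF conversion mode wraps; the local paths below are placeholders):

```bash
# Assumed local paths -- adjust to wherever the folders actually live.
BASE=/models/Meta-Llama-3.1-8B-Instruct   # HF base model folder
LORA=/models/bleta-8b-adapter             # adapter_config.json + adapter_model.safetensors

# The newer converter needs the base model so it can resolve tensor names and shapes.
python convert_lora_to_gguf.py "$LORA" \
  --base "$BASE" \
  --outfile bleta-8b-lora.gguf \
  --outtype q8_0
```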

JohnClaw commented 3 months ago

I followed your advice and made a LoRA for another fresh Albanian model: https://huggingface.co/Kushtrim/Phi-3-medium-4k-instruct-sq. AutoGGUF created a small LoRA GGUF file (128 MB). But I still can't understand: how can I merge this small GGUF LoRA file with the big base Phi-3-medium-4k-instruct GGUF file (I can download it from the bartowski HF page, https://huggingface.co/bartowski/Phi-3-medium-4k-instruct-GGUF, or from other GGUF makers)? Your tool supports merging LoRAs in GGML .bin format only.

leafspark commented 3 months ago

Apologies, that was an oversight on my part; I updated the release to support *.gguf files: https://github.com/leafspark/AutoGGUF/releases/tag/v1.5.1. Now you can use the Export LoRA utility with the bartowski GGUF, which should output a merged one. The Albanian model you shared already has merged safetensors, so you can also clone the repo and put it in the HF to GGUF section, which should be easier; the only downside is that it only supports q8_0/bf16/fp16/fp32 export.
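For reference, rough llama.cpp-level equivalents of the two routes (the filenames below are assumed, and flag names may differ slightly between versions):

```bash
# Route A: merge the GGUF LoRA adapter into the downloaded base GGUF
# (roughly what the Export LoRA utility does; filenames are placeholders).
./llama-export-lora \
  -m Phi-3-medium-4k-instruct-Q8_0.gguf \
  --lora phi-3-medium-4k-instruct-sq-lora.gguf \
  -o Phi-3-medium-4k-instruct-sq-merged.gguf

# Route B: convert the already-merged safetensors repo straight to GGUF
# (HF to GGUF; output limited to q8_0/bf16/fp16/fp32).
python convert_hf_to_gguf.py /models/Phi-3-medium-4k-instruct-sq \
  --outfile Phi-3-medium-4k-instruct-sq-f16.gguf \
  --outtype f16
```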