ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Error converting Lora adapters: unrecognized tensor #3651

Closed Mortadha-abderrhim closed 1 year ago

Mortadha-abderrhim commented 1 year ago

Expected Behavior

Convert LoRA adapters for Mistral to ggml using convert-lora-to-ggml.py. Convert LoRA adapters for Llama 2 to ggml using the same script.

Current Behavior

The same error occurs for both Mistral and Llama 2:

Error: unrecognized tensor base_model.model.lm_head.lora_A.weight


Failure Information (for bugs)

Here is the adapter_config (note that target_modules includes lm_head, which is the tensor the converter fails on):

{
  "inference_mode": true,
  "init_lora_weights": true,
  "layers_pattern": null,
  "layers_to_transform": null,
  "lora_alpha": 16,
  "lora_dropout": 0.05,
  "modules_to_save": null,
  "peft_type": "LORA",
  "r": 8,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
    "gate_proj",
    "q_proj",
    "k_proj",
    "o_proj",
    "v_proj",
    "down_proj",
    "up_proj",
    "lm_head"
  ],
  "task_type": "CAUSAL_LM"
}

ayourtch commented 1 year ago

I got rid of the error with the following diff, but I just looked at how the layers are named elsewhere - I have no idea if what I have done is correct. However, the resulting LoRA adapter seems to take effect.


diff --git a/convert-lora-to-ggml.py b/convert-lora-to-ggml.py
index a937410..e423056 100755
--- a/convert-lora-to-ggml.py
+++ b/convert-lora-to-ggml.py
@@ -44,8 +44,16 @@ def translate_tensor_name(t: str) -> str:
         )
         return output_string
     else:
-        print(f"Error: unrecognized tensor {t}")
-        sys.exit(1)
+        match = re.match(r"base_model\.model\.lm_head\.lora_(A|B)\.weight", t)
+        if match:
+            lora_type = match.group(1)
+            output_string = (
+                f"output.weight.lora{lora_type}"
+            )
+            return output_string
+        else:
+            print(f"Error: unrecognized tensor {t}")
+            sys.exit(1)

 def write_file_header(fout: BinaryIO, params: dict[str, Any]) -> None:
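
For anyone who wants to check the new branch in isolation, here is a small standalone sketch of just the lm_head mapping that the diff adds. The function name translate_lm_head is made up for illustration; in the real script this logic lives inside translate_tensor_name, which prints an error and exits on unrecognized names instead of returning None.

```python
import re

def translate_lm_head(t: str):
    """Map a PEFT lm_head LoRA tensor name to the ggml output name,
    mirroring the branch the diff above adds. Returns None instead of
    exiting when the name does not match (the real script prints an
    error and calls sys.exit(1))."""
    match = re.match(r"base_model\.model\.lm_head\.lora_(A|B)\.weight", t)
    if match:
        return f"output.weight.lora{match.group(1)}"
    return None

# The tensor from the original error message now maps cleanly:
print(translate_lm_head("base_model.model.lm_head.lora_A.weight"))  # output.weight.loraA
```

Escaping the dots in the pattern (r"...\.lora_(A|B)\.weight") just makes the match literal; the unescaped version in the original diff also works in practice because "." matches any character.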
Mortadha-abderrhim commented 1 year ago

It worked, thanks for the fix.

TahaYasinB commented 9 months ago

@ayourtch can you explain how to use the solution provided above? I am new to this.

ayourtch commented 9 months ago

In short: in convert-lora-to-ggml.py, you replace the lines starting with "-" with the lines starting with "+" (or save the diff to a file and apply it with git apply). Mind you, I have very little idea of what I am doing there, just that it seemed to work for me when I tried it :-)