huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
https://huggingface.co/docs/peft
Apache License 2.0

lora_r is doubled when converting olora to lora. #2075

Closed: JaheimLee closed this issue 6 days ago

JaheimLee commented 1 week ago

System Info

Who can help?

No response

Information

Tasks

Reproduction

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
from transformers import AutoModel
from peft import get_peft_model, LoraConfig

base_model = AutoModel.from_pretrained("facebook/opt-350m")
olora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules='all-linear',
    init_lora_weights='olora',
)
olora_model = get_peft_model(base_model, olora_config)
init_path = './tmp/init'
olora_model.save_pretrained(init_path) # Save the model *before* performing any training

# Train the model
# train(olora_model) # Your training loop

# Save the model after training
olora_model.save_pretrained('./tmp/lora', path_initial_model_for_weight_conversion=init_path) 
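
For reference, the two saved adapter configs (shown under "Expected behavior" below) can be inspected with something like the following. This snippet is illustrative and not part of the original report; the commented values are the ones reported below:

from peft import PeftConfig

# Reads adapter_config.json from each save directory
init_cfg = PeftConfig.from_pretrained('./tmp/init')
conv_cfg = PeftConfig.from_pretrained('./tmp/lora')
print(init_cfg.r, init_cfg.lora_alpha)   # 16, 32 per the init adapter config below
print(conv_cfg.r, conv_cfg.lora_alpha)   # 32, 64 per the converted adapter config below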

Expected behavior

The lora_r of the init adapter is 16:

{
  "alpha_pattern": {},
  "auto_mapping": null,
  "base_model_name_or_path": "facebook/opt-350m",
  "bias": "none",
  "fan_in_fan_out": false,
  "inference_mode": false,
  "init_lora_weights": true,
  "layer_replication": null,
  "layers_pattern": null,
  "layers_to_transform": null,
  "loftq_config": {},
  "lora_alpha": 32,
  "lora_dropout": 0.05,
  "megatron_config": null,
  "megatron_core": "megatron.core",
  "modules_to_save": null,
  "peft_type": "LORA",
  "r": 16,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
    "k_proj",
    "q_proj",
    "fc1",
    "out_proj",
    "project_out",
    "project_in",
    "v_proj",
    "fc2"
  ],
  "task_type": null,
  "use_dora": false,
  "use_rslora": false
}

But in the converted adapter it is 32 (and lora_alpha is 64):

{
  "alpha_pattern": {},
  "auto_mapping": {
    "base_model_class": "OPTModel",
    "parent_library": "transformers.models.opt.modeling_opt"
  },
  "base_model_name_or_path": "facebook/opt-350m",
  "bias": "none",
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
  "layer_replication": null,
  "layers_pattern": null,
  "layers_to_transform": null,
  "loftq_config": {},
  "lora_alpha": 64,
  "lora_dropout": 0.05,
  "megatron_config": null,
  "megatron_core": "megatron.core",
  "modules_to_save": null,
  "peft_type": "LORA",
  "r": 32,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
    "k_proj",
    "q_proj",
    "fc1",
    "out_proj",
    "project_out",
    "project_in",
    "v_proj",
    "fc2"
  ],
  "task_type": null,
  "use_dora": false,
  "use_rslora": false
}

The model size is also doubled. Is this expected?

BenjaminBossan commented 1 week ago

Yes, this is expected. Methods like OLoRA modify the base weights too. When converting the OLoRA weights to plain LoRA weights, it has to be ensured that they still work on top of the original base weights. This is only possible by transforming the OLoRA weights, which involves doubling their size. The reason is not entirely straightforward, but it's explained here (the explanation is for LoftQ, but the same idea applies to OLoRA).

Ping @tokenizer-decode for info.
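
To make the rank doubling concrete, here is a small numerical sketch. This is not PEFT code, and the actual bookkeeping in PEFT may differ in details such as signs and the alpha/r scaling; it only illustrates why expressing the trained update relative to the original base weight requires stacking the initial and trained low-rank factors into a rank-2r adapter:

import torch

d_out, d_in, r = 64, 48, 8
W = torch.randn(d_out, d_in)              # original base weight

# Low-rank factors: delta_W = B @ A, with A: (r, d_in) and B: (d_out, r)
A_init, B_init = torch.randn(r, d_in), torch.randn(d_out, r)
A_trained, B_trained = torch.randn(r, d_in), torch.randn(d_out, r)

# OLoRA-style setup: the base weight is shifted so the adapter is a no-op at init,
# then training only updates the low-rank factors (alpha/r scaling omitted here).
W_shifted = W - B_init @ A_init
effective = W_shifted + B_trained @ A_trained

# The same effective weight, written as the ORIGINAL W plus a single adapter:
# delta_W = B_trained @ A_trained - B_init @ A_init, which needs rank up to 2r.
B_conv = torch.cat([B_trained, -B_init], dim=1)   # (d_out, 2r)
A_conv = torch.cat([A_trained, A_init], dim=0)    # (2r, d_in)
assert torch.allclose(effective, W + B_conv @ A_conv, atol=1e-4)

Doubling lora_alpha alongside r (32 to 64 here) keeps the effective scaling lora_alpha / r unchanged, which matches the converted config shown above.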

JaheimLee commented 1 week ago

Got it, thanks for your reply

JaheimLee commented 1 week ago

Found a new problem: after the conversion, the r and alpha in the model's config (the one still in memory) also become 2r and 2*alpha. So maybe it's better to reset them to r and alpha after saving is finished.
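
Continuing the reproduction script above, a minimal sketch of this follow-up issue (assuming the adapter is named "default"; the commented values after the save are as reported, not verified here):

# Before the conversion save, the in-memory config still has the original values
cfg = olora_model.peft_config["default"]
print(cfg.r, cfg.lora_alpha)   # 16, 32

olora_model.save_pretrained('./tmp/lora', path_initial_model_for_weight_conversion=init_path)

# After saving with conversion, the in-memory config reportedly keeps the converted values
print(cfg.r, cfg.lora_alpha)   # reported as 32, 64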

BenjaminBossan commented 1 week ago

Good point @JaheimLee, I created a PR to address that: #2077.