yuanzhoulvpi2017 / zero_nlp

Chinese NLP solutions (LLMs, data, models, training, inference)

[Merging LoRA weights] ChatGLM-6B v2 LoRA fine-tuning: how to load the fine-tuned LoRA parameters for a second round of fine-tuning #140

Open AlanTubring opened 1 year ago

AlanTubring commented 1 year ago

ChatGLM-6B v2 LoRA fine-tuning: how do I load the fine-tuned LoRA parameters and continue fine-tuning on top of them?

[screenshot]

In the fine-tuning script I replaced model = get_peft_model(model, config) with peft_model_id = "output/adgen-chatglm2-6b-lora_version/checkpoint-24000"; model = PeftModel.from_pretrained(model, peft_model_id)

But the number of trainable parameters became 0. [screenshot]

Did I load it the wrong way?
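For reference, roughly what this looks like in code (a minimal sketch; the base-model id THUDM/chatglm2-6b is an assumption, the checkpoint path is the one above):

from transformers import AutoModel
from peft import PeftModel

# load the base model (assumed Hugging Face id, not taken from this thread)
base_model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)

# attach the previously trained LoRA checkpoint
peft_model_id = "output/adgen-chatglm2-6b-lora_version/checkpoint-24000"
# by default PeftModel.from_pretrained loads the adapter for inference
# (is_trainable=False), so all parameters end up frozen
model = PeftModel.from_pretrained(base_model, peft_model_id)
model.print_trainable_parameters()  # reports 0 trainable parameters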

yuanzhoulvpi2017 commented 1 year ago

That's not the right way. You need to merge the LoRA weights into the model first; I'll publish a merging tutorial in the next day or two.

yuanzhoulvpi2017 commented 1 year ago

Hi, sorry for the wait. Please refer to my file infer_lora.ipynb,

which shows you how to merge the LoRA weights:

#%%
from transformers import AutoTokenizer, AutoModel
from peft import PeftModel, PeftConfig
import torch
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '1'

model_name_or_path = "/media/yuanz/新加卷/训练代码/chatglm6b_v2_0716/chatglm2-6b_model"

# load the base ChatGLM2-6B model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_name_or_path, trust_remote_code=True, device_map='auto', torch_dtype=torch.bfloat16)  # .half().cuda()

# attach the trained LoRA adapter
peft_model_id = "output/adgen-chatglm2-6b-lora_version/checkpoint-880"
model = PeftModel.from_pretrained(model, peft_model_id)
model = model.eval()

# merge the LoRA weights into the base model and drop the adapter wrappers
model_merge = model.merge_and_unload()
merge_lora_model_path = "test_merge_dir"

# save the merged model and tokenizer as a regular checkpoint
model_merge.save_pretrained(merge_lora_model_path, max_shard_size="2GB")
tokenizer.save_pretrained(merge_lora_model_path)

This creates a folder "test_merge_dir" containing the merged LoRA weights; for any further training, just use this model as the base.
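From there, a rough sketch of how the second round of fine-tuning could start from the merged checkpoint with a fresh LoRA adapter (the LoraConfig values and target_modules below are illustrative assumptions, not this repo's training settings):

from transformers import AutoTokenizer, AutoModel
from peft import LoraConfig, get_peft_model, TaskType

# load the merged checkpoint produced above as the new base model
merged_model_path = "test_merge_dir"
tokenizer = AutoTokenizer.from_pretrained(merged_model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(merged_model_path, trust_remote_code=True)

# attach a fresh, randomly initialised LoRA adapter for the second round of training
# (r, lora_alpha, target_modules are placeholder values, not the repo's settings)
config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["query_key_value"],
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # now shows a non-zero number of trainable parameters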

AlanTubring commented 1 year ago

Thanks a lot for the answer!!!! Much appreciated!!!!

Re-fused commented 1 year ago

(quotes the merge example from the reply above)

Could you explain why the model has to be merged before the second round of training? Is the LoRA merged into our base model, and then when we train again we add another LoRA on top? I don't fully understand this part.