hiyouga / LLaMA-Factory

Unify Efficient Fine-Tuning of 100+ LLMs
Apache License 2.0

Questions about PiSSA training and inference #4634

Closed ConniePK closed 2 days ago

ConniePK commented 2 days ago

System Info

none

Reproduction

I have a base model at /root/.cache/modelscope/hub/qwen/Qwen-72B-Chat. After running the pissa_init.py script, the output_dir is: /root/.cache/modelscope/hub/qwen/Qwen-72B-Chat-PiSSA/

Question 1: Which model should I load for training? Is it:

  --model_name_or_path '/root/.cache/modelscope/hub/qwen/Qwen-72B-Chat/' \
  --adapter_name_or_path '/root/.cache/modelscope/hub/qwen/Qwen-72B-Chat-PiSSA/pissa_init' \

or

  --model_name_or_path '/root/.cache/modelscope/hub/qwen/Qwen-72B-Chat-PiSSA/' \
  --adapter_name_or_path '/root/.cache/modelscope/hub/qwen/Qwen-72B-Chat-PiSSA/pissa_init' \

Fine-tuning has finished, and the result is saved to: output_dir

Question 2: Which model should I load for inference? Is it:

  --model_name_or_path '/root/.cache/modelscope/hub/qwen/Qwen-72B-Chat/' \
  --adapter_name_or_path output_dir \

or

  --model_name_or_path '/root/.cache/modelscope/hub/qwen/Qwen-72B-Chat-PiSSA/' \
  --adapter_name_or_path output_dir \

Also, training accepts a --pissa_convert true argument. If I add it, does that change what should be loaded in Question 2?

Expected behavior

No response

Others

No response

hiyouga commented 2 days ago

The training arguments are already given in the script. For inference, change only the adapter path.
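Read literally, that reply means the inference arguments are the training arguments with a single value swapped. A minimal sketch of that reading follows. The helper `inference_args` is hypothetical and not part of LLaMA-Factory, and the two angle-bracket values are placeholders for whatever pissa_init.py reported for your run, not real paths.

```python
from typing import Dict


def inference_args(train_args: Dict[str, str], trained_adapter: str) -> Dict[str, str]:
    """Hypothetical helper: copy the training arguments, replacing only the adapter path."""
    args = dict(train_args)  # shallow copy so the training arguments stay untouched
    args["adapter_name_or_path"] = trained_adapter
    return args


# Placeholders (assumptions): substitute the values given by pissa_init.py.
train_args = {
    "model_name_or_path": "<model path given by pissa_init.py>",
    "adapter_name_or_path": "<adapter path given by pissa_init.py>",
}

# "output_dir" is the fine-tuning save path named in the question.
infer_args = inference_args(train_args, "output_dir")

# Print the result in the same flag style used in the question.
for key, value in infer_args.items():
    print(f"--{key} '{value}'")
```

Under this reading, `model_name_or_path` is carried over unchanged from training. The reply does not say whether `--pissa_convert true` alters this, so the sketch does not cover that case.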