shibing624 / textgen

TextGen: Implementation of Text Generation models, including LLaMA, BLOOM, GPT2, BART, T5, SongNet, and more. Provides ready-to-use training and inference for LLaMA, ChatGLM, BLOOM, GPT2, Seq2Seq, BART, T5, UDA, and other models.
Apache License 2.0

LoRA training parameters #40

Open MonkeyTB opened 1 year ago

MonkeyTB commented 1 year ago

trainable params: 22021440 || all params: 6195307952 || trainable%: 0.35545351692954874

Could you explain why the proportion of trainable parameters for LoRA is so much higher after the update? Previously it was only around 0.05%.

MonkeyTB commented 1 year ago

For comparison, the original older LoRA setup had: trainable params: 3670016 || all params: 6176956416 || trainable%: 0.05941463324063059
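The quoted percentage is simply trainable params divided by all params; a quick sanity check of both figures from this thread:

```python
# Sanity-check the trainable-parameter percentages quoted in this thread.
def trainable_pct(trainable: int, total: int) -> float:
    return 100 * trainable / total

new_pct = trainable_pct(22021440, 6195307952)  # current default (all linear layers)
old_pct = trainable_pct(3670016, 6176956416)   # previous default
print(f"new: {new_pct:.4f}%  old: {old_pct:.4f}%")
```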

shibing624 commented 1 year ago

Following the QLoRA paper, the default value of lora_target_modules was changed to all linear layers. Training all dense layers gives the best results.

If you want the previous behavior, just set the lora_target_modules parameter back explicitly.
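The jump in trainable parameters follows directly from how LoRA counts: an adapter of rank r on a Linear layer of shape (d_in, d_out) adds r * (d_in + d_out) parameters, so targeting every linear layer instead of just the attention projection multiplies the total. A sketch of that arithmetic, assuming the model is ChatGLM-6B (28 layers, hidden size 4096, fused QKV output 12288) with rank r=8 on only the `query_key_value` module as the old default; these architecture figures are inferred from the numbers in this thread, not stated in the source:

```python
# A LoRA adapter on a Linear layer of shape (d_in, d_out) with rank r
# adds r * (d_in + d_out) trainable parameters (matrices A and B).
def lora_params(r: int, d_in: int, d_out: int) -> int:
    return r * (d_in + d_out)

# Assumed old default: only the fused attention projection `query_key_value`
# (ChatGLM-6B: 28 layers, 4096 -> 3*4096 = 12288), rank 8.
layers = 28
old_total = layers * lora_params(8, 4096, 3 * 4096)
print(old_total)  # matches the old trainable-param count reported above
```

This reproduces exactly the 3,670,016 trainable parameters reported for the old default; with all linear layers (MLP projections included) as targets, the count grows to the ~22M reported for the new default. To revert, pass the old module list via the `lora_target_modules` parameter mentioned above (the module name `query_key_value` is an assumption for ChatGLM; it differs per architecture, e.g. `q_proj`/`v_proj` for LLaMA).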

stale[bot] commented 11 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.