microsoft / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
https://www.deepspeed.ai/
Apache License 2.0

[BUG]What is the meaning of setting "actor_lora_module_name" without "only_optimize_lora"? #3363

Open NostalgiaOfTime opened 1 year ago

NostalgiaOfTime commented 1 year ago

I see that many shell scripts include "--actor_lora_module_name" but not "--only_optimize_lora". According to the source code, this causes all of the model parameters to be trained instead of only the LoRA layers.

So could someone tell me the meaning of setting actor_lora_module_name without only_optimize_lora?

NostalgiaOfTime commented 1 year ago

Is there any update? I am still confused about it.

xinghuang2050 commented 11 months ago

I am also confused about the "--only_optimize_lora" parameter. What is the fundamental difference between setting it and not setting it?

EeyoreLee commented 10 months ago

@NostalgiaOfTime @xinghuang2050 - hi, actor_lora_module_name is used to filter which linear layers are converted to LoRA linear layers, and only_optimize_lora selects whether to optimize the full set of parameters or only the LoRA matrices A and B.
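To make the distinction concrete, here is a minimal toy sketch of how the two flags interact. This is an illustrative example, not DeepSpeed-Chat's actual implementation; the `LoRALinear` class and the two helper functions below are made-up stand-ins for what the real flags do:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A base linear layer plus trainable low-rank matrices A and B."""
    def __init__(self, base: nn.Linear, r: int = 4):
        super().__init__()
        self.base = base
        # Standard LoRA init: A small random, B zero, so the wrapped layer
        # initially computes exactly what the base layer did.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T) @ self.lora_B.T

def convert_linear_to_lora(model: nn.Module, part_name: str, r: int = 4):
    """Role of --actor_lora_module_name: only linears whose qualified
    name contains `part_name` are replaced with LoRA linears."""
    for parent_name, parent in list(model.named_modules()):
        for child_name, child in list(parent.named_children()):
            full = f"{parent_name}.{child_name}" if parent_name else child_name
            if isinstance(child, nn.Linear) and part_name in full:
                setattr(parent, child_name, LoRALinear(child, r))
    return model

def only_optimize_lora_parameters(model: nn.Module):
    """Role of --only_optimize_lora: freeze everything except A and B."""
    for name, p in model.named_parameters():
        p.requires_grad = "lora_A" in name or "lora_B" in name
    return model

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(4, 8)
        self.decoder = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 8))
        self.head = nn.Linear(8, 2)

    def forward(self, x):
        return self.head(self.decoder(self.embed(x)))

model = convert_linear_to_lora(Toy(), "decoder")
# Without --only_optimize_lora every parameter, base weights included,
# still has requires_grad=True and would land in the optimizer.
all_trainable = all(p.requires_grad for p in model.parameters())
only_optimize_lora_parameters(model)
lora_only = sorted(n for n, p in model.named_parameters() if p.requires_grad)
```

So setting actor_lora_module_name alone changes the model's structure (extra A/B matrices on the matched layers) but not which parameters are trained: everything is still updated, which matches what the source code does in the scripts the question refers to. Only adding only_optimize_lora restricts the optimizer to the A/B matrices.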