Outsider565 / LoRA-GA


RuntimeError: cannot register a hook on a tensor that doesn't require gradient #9

Open qyx1121 opened 4 days ago

qyx1121 commented 4 days ago

Hi, does this currently support MLLMs? Fine-tuning an MLLM may require freezing the vision encoder's parameters, but that seems to raise an error during `estimate_gradient`:

```
Traceback (most recent call last):
  File "finetune_lora_ga.py", line 365, in <module>
    train()
  File "finetune_lora_ga.py", line 321, in train
    named_grad = estimate_gradient(
  File "LoRA-GA/peft/src/peft/utils/lora_ga_utils/lora_ga_utils.py", line 28, in wrapper
    result = func(*args, **kwargs)
  File "LoRA-GA/peft/src/peft/utils/lora_ga_utils/lora_ga_utils.py", line 100, in estimate_gradient
    with OffloadContext(
  File "LoRA-GA/peft/src/peft/utils/lora_ga_utils/offload_utils_for_quant/context.py", line 57, in __enter__
    self.gradientOffloadHookContext.__enter__()
  File "LoRA-GA/peft/src/peft/utils/lora_ga_utils/offload_utils_for_quant/gradient_offload.py", line 32, in __enter__
    self.register_gradient_hook()
  File "LoRA-GA/peft/src/peft/utils/lora_ga_utils/offload_utils_for_quant/gradient_offload.py", line 40, in register_gradient_hook
    hook = param.register_hook(
  File "/data/miniconda3/envs/env-3.9.16/lib/python3.9/site-packages/torch/_tensor.py", line 532, in register_hook
    raise RuntimeError(
RuntimeError: cannot register a hook on a tensor that doesn't require gradient
```

zengls3186428803 commented 4 days ago

Frozen weights have `requires_grad` set to `False`, and PyTorch cannot register gradient-related hooks on parameters that don't require gradients.
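A minimal standalone repro of that PyTorch behavior (not LoRA-GA code):

```python
import torch

p = torch.nn.Parameter(torch.randn(4), requires_grad=False)  # a "frozen" weight
p.register_hook(lambda grad: grad)
# RuntimeError: cannot register a hook on a tensor that doesn't require gradient
```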

The previous code didn't handle the case where some weights are frozen. I've updated the code on the dev branch to check whether the `requires_grad` attribute is `True`: `if not param.requires_grad`, we `continue` and skip registering the gradient offload hook.
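A minimal sketch of what that check looks like; the helper names (`register_gradient_hooks`, `make_offload_hook`, `offload_store`) are illustrative, not the actual dev-branch code:

```python
import torch

def register_gradient_hooks(model: torch.nn.Module, offload_store: dict):
    """Register a gradient-offload hook on every trainable parameter.

    Sketch only: the real offload logic in LoRA-GA may differ.
    """
    hooks = []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            # Frozen weights (e.g. an MLLM's frozen vision encoder) never
            # receive gradients, and PyTorch refuses to register hooks on
            # them, so skip them entirely.
            continue

        def make_offload_hook(n):
            def hook(grad):
                offload_store[n] = grad.detach().to("cpu")  # stash a CPU copy
                # returning None leaves the in-flight gradient unchanged
            return hook

        hooks.append(param.register_hook(make_offload_hook(name)))
    return hooks
```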

If it's convenient for you, could you test with the dev branch and see whether the problem is resolved?

======================================== As for why "`if not param.requires_grad`, then `continue`" is reasonable, my thinking is as follows.

Frozen weights shouldn't have LoRA (or LoRA-GA) applied to them, so they don't require gradients, and there is no need to estimate gradients for frozen weights.