Open qyx1121 opened 4 days ago
Frozen weights have requires_grad set to False, and PyTorch cannot register gradient-related hooks on parameters that do not require gradients.
The previous code did not handle the case where some weights are frozen. I have updated the code on the dev branch: it now checks whether the requires_grad attribute is True, and if not param.requires_grad it does continue, so no gradient offload hook is registered for that parameter.
If it's convenient for you, please test with the dev branch to see whether the problem is resolved.
======================================== My reasoning for the "if not param.requires_grad then continue" check is as follows.
Frozen weights should not have LoRA (or LoRA-GA) applied to them, so they do not require gradients, and there should be no need to estimate gradients for frozen weights.
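The check described above can be sketched roughly as follows. This is a minimal illustration, not the actual LoRA-GA code: the hook body is a placeholder, and the helper name is made up for the example.

```python
import torch
import torch.nn as nn

def register_gradient_hooks(model: nn.Module):
    """Register a hook on every parameter that requires gradients.

    Frozen parameters are skipped, because torch.Tensor.register_hook
    raises a RuntimeError on tensors with requires_grad=False.
    """
    hooks = []
    for param in model.parameters():
        if not param.requires_grad:
            continue  # frozen weight (e.g. a frozen vision encoder): no hook
        # Placeholder hook; the real gradient offload hook would move the
        # gradient off the GPU here instead of returning it unchanged.
        hooks.append(param.register_hook(lambda grad: grad))
    return hooks

# Demo: freeze the first layer, as when freezing an MLLM's vision encoder.
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))
for p in model[0].parameters():
    p.requires_grad = False

hooks = register_gradient_hooks(model)
print(len(hooks))  # hooks only on the second layer's weight and bias
```

Without the `continue`, the first frozen parameter reached in the loop would raise the RuntimeError from the traceback below.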
Hi, I'd like to ask whether MLLMs are currently supported. Fine-tuning an MLLM may require freezing the vision encoder's parameters, but this seems to raise an error during estimate_gradient:

Traceback (most recent call last):
  File "finetune_lora_ga.py", line 365, in <module>
    train()
  File "finetune_lora_ga.py", line 321, in train
    named_grad = estimate_gradient(
  File "LoRA-GA/peft/src/peft/utils/lora_ga_utils/lora_ga_utils.py", line 28, in wrapper
    result = func(*args, **kwargs)
  File "LoRA-GA/peft/src/peft/utils/lora_ga_utils/lora_ga_utils.py", line 100, in estimate_gradient
    with OffloadContext(
  File "LoRA-GA/peft/src/peft/utils/lora_ga_utils/offload_utils_for_quant/context.py", line 57, in __enter__
    self.gradientOffloadHookContext.__enter__()
  File "LoRA-GA/peft/src/peft/utils/lora_ga_utils/offload_utils_for_quant/gradient_offload.py", line 32, in __enter__
    self.register_gradient_hook()
  File "LoRA-GA/peft/src/peft/utils/lora_ga_utils/offload_utils_for_quant/gradient_offload.py", line 40, in register_gradient_hook
    hook = param.register_hook(
  File "/data/miniconda3/envs/env-3.9.16/lib/python3.9/site-packages/torch/_tensor.py", line 532, in register_hook
    raise RuntimeError(
RuntimeError: cannot register a hook on a tensor that doesn't require gradient
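For reference, the failure at the bottom of the traceback can be reproduced in isolation with a minimal sketch, independent of LoRA-GA:

```python
import torch

frozen = torch.zeros(3, requires_grad=False)  # stands in for a frozen weight
try:
    frozen.register_hook(lambda grad: grad)
    msg = None
except RuntimeError as err:
    msg = str(err)
print(msg)  # cannot register a hook on a tensor that doesn't require gradient
```

This is why skipping parameters with requires_grad=False before calling register_hook avoids the crash.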