**Describe the bug (Mandatory)**

Training fails after configuring grouped parameters for the AdamW optimizer:

```
ValueError: For primitive[Adam], the var_shape: [1024,1024,] must be equal to [50265,1024,]
```

The optimizer and scheduler were configured as:

```python
head_param = list(map(id, model.classifier.parameters()))
others_param = filter(lambda p: id(p) not in head_param, model.parameters())
optimizer = AdamW(
    [
        {"params": model.classifier.parameters(), "lr": head_lr},
        {"params": others_param, "lr": fft_lr},
    ],
    weight_decay=0.,
)
lr_scheduler = get_linear_schedule_with_warmup(
    optimizer=optimizer,
    num_warmup_steps=0.06 * (len(train_dataset) * num_epochs),
    num_training_steps=(len(train_dataset) * num_epochs),
)
```
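One detail that may matter here (an observation, not a confirmed root cause): `others_param` is a `filter` object, and in Python 3 a `filter` is a one-shot iterator, so any second traversal sees an empty sequence. A minimal, standalone demonstration:

```python
# A filter object is a one-shot iterator: once consumed, it yields nothing.
params = ["classifier.weight", "encoder.weight", "embed.weight"]
others = filter(lambda p: p != "classifier.weight", params)

print(list(others))  # ['encoder.weight', 'embed.weight']
print(list(others))  # [] -- the iterator is already exhausted
```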
- **Hardware Environment (`Ascend`/`GPU`/`CPU`)**: Ascend / CPU
- **Software Environment (Mandatory)**:
  - MindSpore version: 2.4.0
  - Python version: 3.9.19
  - OS platform and distribution: Windows 11
  - GCC/Compiler version (if compiled from source):
- **Execute Mode (Mandatory) (`PyNative`/`Graph`)**: PyNative / Graph

**To Reproduce (Mandatory)**

Using the IA3 model as an example:

1. Open `llm\peft\ia3\sequence_classification.ipynb`.
2. Change the optimizer configuration (a defensive variant is sketched after the screenshots below), replacing the original code:
```python
optimizer = AdamW(params=model.parameters(), lr=lr)
```
with:
```python
head_param = list(map(id, model.classifier.parameters()))
others_param = filter(lambda p: id(p) not in head_param, model.parameters())
head_lr = 6e-3
fft_lr = 6e-2
optimizer = AdamW(
    [
        {"params": model.classifier.parameters(), "lr": head_lr},
        {"params": others_param, "lr": fft_lr},
    ],
    weight_decay=0.,
)
```
3. Run the code to train the model.
4. An error is raised during training:

![image](https://github.com/user-attachments/assets/8cf7255e-32d7-4741-beb9-7541ffa09b08)

**Expected behavior (Mandatory)**

The model trains normally.

**Screenshots / Logs (Mandatory)**

![bebdcfb9acd2f0b27a5dda837fbb8f58](https://github.com/user-attachments/assets/eefc9485-c684-47e3-982e-9076c7b2326c)
![image](https://github.com/user-attachments/assets/8303f62c-42e6-4ec6-b976-6125f8433818)
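As referenced in step 2 above, here is a defensive variant of the grouped setup that materializes both groups as concrete lists. This is only a sketch (it has not been verified to avoid the error) and reuses `head_lr` and `fft_lr` from the reproduction code:

```python
# Sketch: build both parameter groups as plain lists so neither group
# depends on a one-shot filter iterator.
head_params = list(model.classifier.parameters())
head_ids = set(map(id, head_params))
other_params = [p for p in model.parameters() if id(p) not in head_ids]

optimizer = AdamW(
    [
        {"params": head_params, "lr": head_lr},
        {"params": other_params, "lr": fft_lr},
    ],
    weight_decay=0.,
)
```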
Resolved by using the latest package:
```python
from mindnlp.core import value_and_grad

grad_fn = value_and_grad(forward_fn, model.parameters())

for data in dataset:
    optimizer.zero_grad()
    loss = grad_fn(**data)
    optimizer.step()
```
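The snippet assumes a `forward_fn` that computes the training loss; it is not shown in the comment, but a hypothetical definition for the sequence-classification setup above could look like this:

```python
# Hypothetical helper assumed by the snippet above (not from the original
# comment): run the model on one batch and return the scalar loss.
def forward_fn(**batch):
    outputs = model(**batch)
    return outputs.loss
```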
**Additional notes**

The same grouped-parameter configuration raises this error here but runs normally in PyTorch. An ungrouped configuration also runs normally:

```python
optimizer = AdamW(params=model.parameters(), lr=fft_lr)  # runs normally
```
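For reference, a minimal, self-contained PyTorch analogue of the grouped setup (a toy model stands in for the IA3 classifier; illustrative only):

```python
import torch
import torch.nn as nn

# Toy stand-ins for the real model body and classification head.
body = nn.Linear(8, 8)
head = nn.Linear(8, 2)

# The same grouped-parameter construction is accepted by torch.optim.AdamW.
optimizer = torch.optim.AdamW(
    [
        {"params": head.parameters(), "lr": 6e-3},
        {"params": body.parameters(), "lr": 6e-2},
    ],
    weight_decay=0.0,
)
print(optimizer)
```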