mindspore-lab / mindnlp

Easy-to-use and high-performance NLP and LLM framework based on MindSpore, compatible with models and datasets of 🤗Huggingface.
https://mindnlp.cqu.ai/
Apache License 2.0
708 stars 201 forks

AdamW optimizer raises an error when configured with parameter groups #1828

Closed barryyfli closed 1 week ago

barryyfli commented 1 week ago

**Describe the bug (Mandatory)**
Training fails after the AdamW optimizer is configured with parameter groups. The error is:

ValueError: For primitive[Adam], the var_shape: [1024,1024,] must be equal to [50265,1024,]


head_param = list(map(id, model.classifier.parameters()))

others_param = filter(lambda p: id(p) not in head_param, model.parameters())

# Raises the error above; the same code runs correctly in PyTorch

optimizer = AdamW([ {"params": model.classifier.parameters(), "lr": head_lr}, {"params": others_param, "lr": fft_lr} ],weight_decay=0.)

optimizer = AdamW(params=model.parameters(), lr=fft_lr)  # runs without error

# Instantiate scheduler

lr_scheduler = get_linear_schedule_with_warmup(
    optimizer=optimizer,
    num_warmup_steps=0.06 * (len(train_dataset) * num_epochs),
    num_training_steps=(len(train_dataset) * num_epochs),
)
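For readers unfamiliar with the scheduler, here is a minimal pure-Python sketch of the linear-warmup-then-linear-decay shape that a function named `get_linear_schedule_with_warmup` conventionally produces. Whether mindnlp's implementation matches this formula exactly, and whether it accepts the float `num_warmup_steps` passed in the report, are assumptions. The helper name `linear_warmup_factor` and the sizes below are hypothetical.

```python
def linear_warmup_factor(step: int, num_warmup_steps: int, num_training_steps: int) -> float:
    """Multiplier applied to the base learning rate at a given step (illustrative only)."""
    if step < num_warmup_steps:
        # Warmup phase: ramp linearly from 0 up to 1.
        return step / max(1, num_warmup_steps)
    # Decay phase: ramp linearly from 1 down to 0 at the last step.
    remaining = num_training_steps - step
    return max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))


# Hypothetical sizes standing in for len(train_dataset) and num_epochs.
num_batches, num_epochs = 100, 10
num_training_steps = num_batches * num_epochs
# Rounding to an int avoids handing a float step count to the scheduler.
num_warmup_steps = round(0.06 * num_training_steps)
```

Converting the warmup count to an integer is a defensive choice here, not something the thread identifies as related to the reported error.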


- **Hardware Environment (`Ascend`/`GPU`/`CPU`)**:
> /device ascend/CPU

- **Software Environment (Mandatory)**:
-- MindSpore version: 2.4.0
-- Python version: Python 3.9.19
-- OS platform and distribution: Windows 11
-- GCC/Compiler version (if compiled from source):

- **Execute Mode (Mandatory) (`PyNative`/`Graph`)**:
> Please delete the mode not involved:
> /mode pynative
> /mode graph

**To Reproduce (Mandatory)**
Using the IA3 model as an example:
1. Open 'llm\peft\ia3\sequence_classification.ipynb'
2. Modify the optimizer configuration code:
Change the original code:

optimizer = AdamW(params=model.parameters(), lr=lr)

to:

head_param = list(map(id, model.classifier.parameters()))
others_param = filter(lambda p: id(p) not in head_param, model.parameters())
head_lr = 6e-3
fft_lr = 6e-2
optimizer = AdamW(
    [
        {"params": model.classifier.parameters(), "lr": head_lr},
        {"params": others_param, "lr": fft_lr},
    ],
    weight_decay=0.,
)


3. Run the code to start training
4. The error is raised during training
![image](https://github.com/user-attachments/assets/8cf7255e-32d7-4741-beb9-7541ffa09b08)
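The grouping step above can be sketched without any framework. In the reported snippet, `others_param` is a one-shot `filter` object, and `model.classifier.parameters()` may also be a generator (an assumption, since this depends on the library). The sketch below materializes both groups as lists before handing them to an optimizer. The helper `build_param_groups` and the `FakeParam` class are hypothetical stand-ins, and this is a defensive pattern only, not a confirmed fix for the error.

```python
from typing import Any, Dict, Iterable, List


def build_param_groups(all_params: Iterable[Any], head_params: Iterable[Any],
                       head_lr: float, base_lr: float) -> List[Dict[str, Any]]:
    """Split parameters into a head group and a remainder group by object identity."""
    head = list(head_params)  # materialize: a generator can be consumed only once
    head_ids = {id(p) for p in head}
    others = [p for p in all_params if id(p) not in head_ids]
    return [
        {"params": head, "lr": head_lr},
        {"params": others, "lr": base_lr},
    ]


class FakeParam:
    """Stand-in for a framework parameter; only object identity matters here."""

    def __init__(self, name: str) -> None:
        self.name = name


# Hypothetical parameters standing in for model.parameters() and
# model.classifier.parameters().
embed = FakeParam("embed")
dense = FakeParam("dense")
cls_w = FakeParam("classifier.weight")
groups = build_param_groups([embed, dense, cls_w], iter([cls_w]),
                            head_lr=6e-3, base_lr=6e-2)
```

The resulting `groups` list has the same shape as the argument passed to `AdamW` in step 2, so it could be passed in place of the inline list there.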

**Expected behavior (Mandatory)**
The model trains normally.

**Screenshots / Logs (Mandatory)**
![bebdcfb9acd2f0b27a5dda837fbb8f58](https://github.com/user-attachments/assets/eefc9485-c684-47e3-982e-9076c7b2326c)

![image](https://github.com/user-attachments/assets/8303f62c-42e6-4ec6-b976-6125f8433818)

**Additional context (Optional)**
Add any other context about the problem here.
lvyufeng commented 1 week ago

This is resolved by using the latest package:

from mindnlp.core import value_and_grad

grad_fn = value_and_grad(forward_fn, model.parameters())

for data in dataset:
    optimizer.zero_grad()
    loss = grad_fn(**data)
    optimizer.step()
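For anyone debugging a similar shape mismatch, one possible reading of the error is that gradients and optimizer parameters were paired by position, so reordering parameters into groups misaligned them. This is an assumption and is not confirmed anywhere in this thread. The sketch below is a framework-free diagnostic for that hypothesis. The helper `first_shape_mismatch` is hypothetical, and the shapes are chosen only to echo the `[1024,1024]` vs `[50265,1024]` message from the report.

```python
from typing import Optional, Sequence, Tuple

Shape = Tuple[int, ...]


def first_shape_mismatch(param_shapes: Sequence[Shape],
                         grad_shapes: Sequence[Shape]) -> Optional[Tuple[int, Shape, Shape]]:
    """Return (index, param_shape, grad_shape) for the first positional mismatch, else None.

    Only the overlapping prefix is compared; a length difference is not reported.
    """
    for index, (p_shape, g_shape) in enumerate(zip(param_shapes, grad_shapes)):
        if p_shape != g_shape:
            return index, p_shape, g_shape
    return None


# Hypothetical shapes: gradients follow model order (embedding first), while the
# optimizer holds parameters in group order (classifier head first).
grad_order = [(50265, 1024), (1024, 1024), (2, 1024)]
group_order = [(1024, 1024), (2, 1024), (50265, 1024)]
mismatch = first_shape_mismatch(group_order, grad_order)
```

In real code the two shape lists would be collected from the optimizer's parameter groups and from the gradients returned by the gradient function. How to obtain them depends on the library version and is not shown here.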