THUDM / GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs
Apache License 2.0

Instruction fine-tuning error: A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer. #633

Open 1938225289 opened 4 weeks ago

1938225289 commented 4 weeks ago

System Info

transformers 4.46.0, Python 3.10

Who can help?

No response

Information

Reproduction

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
tokenizer.padding_side = 'left'  # use left padding

Expected behavior

Has anyone else run into the same problem?

sixsixcoder commented 4 weeks ago

Which model are you using, and which fine-tuning script?

zRzRzRzRzRzRzR commented 4 weeks ago

This warning should not appear here.

tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True, padding_side='left')

Could you try this instead?
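For readers unsure what the warning refers to, here is a minimal pure-Python sketch of the difference between left and right padding. The helper `pad_batch` and the token ids are hypothetical and only illustrate the idea; this is not the tokenizer's actual implementation.

```python
from typing import List, Tuple


def pad_batch(
    sequences: List[List[int]],
    pad_token_id: int,
    padding_side: str = "left",
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad token-id lists to a common length.

    Returns (input_ids, attention_mask), where the mask is 1 for real
    tokens and 0 for padding. Hypothetical helper for illustration only.
    """
    if padding_side not in ("left", "right"):
        raise ValueError("padding_side must be 'left' or 'right'")

    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        pad = [pad_token_id] * (max_len - len(seq))
        pad_mask = [0] * len(pad)
        real_mask = [1] * len(seq)
        if padding_side == "left":
            # Padding goes in front, so the last position is always a real token.
            input_ids.append(pad + list(seq))
            attention_mask.append(pad_mask + real_mask)
        else:
            # Padding goes at the end, so shorter rows end with pad tokens.
            input_ids.append(list(seq) + pad)
            attention_mask.append(real_mask + pad_mask)
    return input_ids, attention_mask


batch = [[11, 12, 13], [21]]
left_ids, left_mask = pad_batch(batch, pad_token_id=0, padding_side="left")
right_ids, right_mask = pad_batch(batch, pad_token_id=0, padding_side="right")
```

A decoder-only model continues from the last position of each row. With left padding that position holds a real token for every row; with right padding the shorter rows end in pad tokens, which is the situation the warning describes.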

1938225289 commented 3 weeks ago

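> Which model are you using, and which fine-tuning script?

9B-Chat with LoRA fine-tuning. I modified finetune.py to fit the format of my instruction fine-tuning dataset.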

1938225289 commented 3 weeks ago

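> This warning should not appear here.
>
> tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True, padding_side='left')
>
> Could you try this instead?

I have tried this approach, and the same problem still occurs.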
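One possibility, not confirmed in this thread, is that the modified finetune.py pads batches itself (for example in a custom collate function), in which case `tokenizer.padding_side` would have no effect on the tensors that reach generation. The sketch below is a debugging aid under that assumption. The helper `has_right_padding` is hypothetical; it only approximates the condition the warning describes (a batched input with a row ending in the pad token) and is not the library's own check.

```python
from typing import List


def has_right_padding(input_ids: List[List[int]], pad_token_id: int) -> bool:
    """Return True if a batch of token-id rows appears to be right-padded.

    Hypothetical debugging helper: a batch with more than one row is
    flagged when any row ends with the pad token.
    """
    if len(input_ids) <= 1:
        # With a single sequence there is nothing to pad against.
        return False
    return any(len(row) > 0 and row[-1] == pad_token_id for row in input_ids)


# Example batches with pad_token_id = 0 (token ids are made up).
right_padded = [[1, 2, 0], [3, 4, 5]]
left_padded = [[0, 1, 2], [3, 4, 5]]

right_flag = has_right_padding(right_padded, pad_token_id=0)
left_flag = has_right_padding(left_padded, pad_token_id=0)
```

Calling a check like this on the `input_ids` of each evaluation batch (converted to nested lists) just before generation would show whether the batches are still right-padded despite the tokenizer setting. Note that it can report a false positive if the pad token is also used as an end-of-sequence token and a row legitimately ends with it.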