THUDM / GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs | 开源多语言多模态对话模型
Apache License 2.0

AutoModelForSequenceClassification may have a bug #197

Closed fengyunflya closed 4 months ago

fengyunflya commented 4 months ago

System Info / 系統信息

When using AutoModelForSequenceClassification, calling model(**inputs) raises RuntimeError: mat1 and mat2 must have the same dtype, but got BFloat16 and Half. It turns out model.classifier_head is float16. Also, the output is not sequence classification but token classification.
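The error can be illustrated without loading GLM-4 at all. A minimal sketch, assuming only that the hidden states are bfloat16 while the classification head was loaded in float16 (the tensor sizes are illustrative, not the model's real dimensions):

```python
import torch

# Hidden states in bfloat16, head weights in float16 -- the mismatch from the report.
hidden_states = torch.randn(1, 8, dtype=torch.bfloat16)
head = torch.nn.Linear(8, 2).to(torch.float16)

try:
    head(hidden_states)
except RuntimeError as e:
    print(e)  # mat1 and mat2 must have the same dtype ...

# Casting the head to the hidden-state dtype makes the matmul succeed.
head = head.to(torch.bfloat16)
logits = head(hidden_states)
print(logits.dtype)
```

Casting the head (rather than the hidden states) is the cheaper direction, since the head is a single small layer.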

Who can help? / 谁可以帮助到您?

No response

Information / 问题信息

Reproduction / 复现过程

  1. model = AutoModelForSequenceClassification.from_pretrained("THUDM/glm-4-9b-chat", trust_remote_code=True)
  2. tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-4-9b-chat", trust_remote_code=True)
  3. inputs = tokenizer(["Hello, I love food. It is great"], return_tensors='pt')
  4. output = model(**inputs)

Expected behavior / 期待表现

I just want to know why this happens.

fengyunflya commented 4 months ago

Also, after loading the model, print(model) shows two Linear layers at the end, but their dimensions don't line up, so there should only be one head layer. I don't know why two layers are printed.

duzx16 commented 4 months ago

The dtype issue has been fixed (https://huggingface.co/THUDM/glm-4-9b-chat/commit/0deb1dd1717354614688b73617e63d444d942dbe). The issue of the output being token classification rather than sequence classification was fixed in an earlier commit (https://huggingface.co/THUDM/glm-4-9b-chat/commit/12c80499bc07e21c05c11ee2a6035371bf53f1a6).
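For context on the second bug: the difference between token and sequence classification is just whether the head is applied to every position or to one pooled representation. A toy sketch with made-up shapes (last-token pooling is shown as an example; it is a common choice for decoder-only models, not a claim about GLM-4's exact pooling):

```python
import torch

# Toy hidden states: (batch, seq_len, hidden). Sizes are illustrative only.
hidden_states = torch.randn(1, 6, 8)
head = torch.nn.Linear(8, 2)  # 2-way classifier

# Token classification: the head scores every position -> (1, 6, 2).
token_logits = head(hidden_states)

# Sequence classification: pool first (here, the last token's hidden state),
# then score once per sequence -> (1, 2).
sequence_logits = head(hidden_states[:, -1])

print(token_logits.shape, sequence_logits.shape)
```

So logits with a trailing seq_len dimension, as the reporter observed, are the signature of the head being applied per token.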