mindspore-lab / mindnlp

Easy-to-use and high-performance NLP and LLM framework based on MindSpore, compatible with models and datasets of 🤗Huggingface.
https://mindnlp.cqu.ai/
Apache License 2.0

MarkupLM model training problem #1822

Open yegoling opened 1 week ago

yegoling commented 1 week ago

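Describe the bug (Mandatory): After replacing the dataloader with the torch DataLoader and converting its tensors to MindSpore tensors, feeding them into the MarkupLM model for training still produces a wrong forward-pass output.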

CPU

To Reproduce (Mandatory): Run the training code to train the markuplm-base model. The forward-pass output is wrong (the loss and all logits are NaN).

Expected behavior (Mandatory): The forward pass returns normal output.

Screenshots / logs (Mandatory)

import mindspore as ms

# `dataloader` (a torch DataLoader) and `model` (the MarkupLM model)
# are defined in the attached MindSpore code.
for batch in dataloader:
    # Convert each torch tensor to a MindSpore tensor via NumPy.
    inputs = {k: ms.from_numpy(v.numpy()) for k, v in batch.items()}
    outputs = model(**inputs)
    print(outputs)

Output:

TokenClassifierOutput(loss=Tensor(shape=[], dtype=Float32, value= nan), logits=Tensor(shape=[2, 512, 4], dtype=Float32, value=
[[[      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  ...
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)]],
 [[      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  ...
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)]]]), hidden_states=None, attentions=None)
TokenClassifierOutput(loss=Tensor(shape=[], dtype=Float32, value= nan), logits=Tensor(shape=[2, 512, 4], dtype=Float32, value=
[[[      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  ...
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)]],
 [[      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
...
  ...
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)]]]), hidden_states=None, attentions=None)
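
A possible way to narrow this down is to confirm that the converted inputs themselves are finite and have the dtypes the model expects before they reach the forward pass. The sketch below is only an illustration, not part of the attached training code: `summarize_batch` is a hypothetical helper name, and it assumes each batch value has already been turned into a NumPy array (for example with `.numpy()`, as in the loop above).

```python
import numpy as np

def summarize_batch(batch):
    """Return {name: (dtype, shape, has_non_finite)} for a dict of NumPy arrays.

    Hypothetical diagnostic helper; it does not call MindSpore or the model.
    """
    report = {}
    for name, arr in batch.items():
        arr = np.asarray(arr)
        # Only floating-point arrays can contain NaN/Inf; integer ids cannot.
        is_float = np.issubdtype(arr.dtype, np.floating)
        has_non_finite = bool(is_float and not np.isfinite(arr).all())
        report[name] = (str(arr.dtype), arr.shape, has_non_finite)
    return report

# Made-up example batch, standing in for one converted dataloader batch.
example = {
    "input_ids": np.array([[101, 2023, 102]], dtype=np.int64),
    "scores": np.array([0.5, np.nan], dtype=np.float32),
}
for name, info in summarize_batch(example).items():
    print(name, info)
```

In the loop above this could be called as `summarize_batch({k: v.numpy() for k, v in batch.items()})`. If every input is reported as finite, the NaN values are more likely to originate inside the model or its loaded weights than in the data conversion.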

Additional context (Optional): The MindSpore code that shows the problem, and the PyTorch code that produces normal output for comparison, are attached below:

MindSpore code: mindspore.md
PyTorch code: torch.md