ymcui / Chinese-BERT-wwm

Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)
https://ieeexplore.ieee.org/document/9599397
Apache License 2.0

Question about the chinese-roberta-wwm-ext-large model #98

Closed pxxgogo closed 4 years ago

pxxgogo commented 4 years ago

Hello, when using your chinese-roberta-wwm-ext-large model for an MLM task, I ran into what looks like a bug. I tried the inference code from google/bert as well as the inference code from Hugging Face's Transformers library, and both show obvious problems. Below is the code that calls Transformers:


from transformers import BertTokenizer, AutoModelWithLMHead
import torch
from torch.nn.functional import softmax

tokenizer = BertTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext-large")
model = AutoModelWithLMHead.from_pretrained("hfl/chinese-roberta-wwm-ext-large")

inputtext = "今天[MASK]情很好"

# Encode once and locate the [MASK] position (id 103 in this vocab) via the tokenizer.
encoded = tokenizer.encode(inputtext, add_special_tokens=True)
maskpos = encoded.index(tokenizer.mask_token_id)
input_ids = torch.tensor(encoded).unsqueeze(0)  # Batch size 1

outputs = model(input_ids, masked_lm_labels=input_ids)  # `labels=` in newer Transformers versions
loss, prediction_scores = outputs[:2]
logit_prob = softmax(prediction_scores[0, maskpos], dim=-1).data.tolist()
predicted_index = torch.argmax(prediction_scores[0, maskpos]).item()
predicted_token = tokenizer.convert_ids_to_tokens([predicted_index])[0]
print(predicted_token, logit_prob[predicted_index])

The output is:

##覆 0.00043045077472925186
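A quicker way to reproduce the same check is the Transformers fill-mask pipeline; this is a minimal sketch that assumes only the public pipeline API and the same checkpoint name as above:

from transformers import pipeline

# Load the released checkpoint; if the MLM head was not saved with the weights,
# the top predictions for [MASK] will look essentially random, as reported above.
fill_mask = pipeline("fill-mask", model="hfl/chinese-roberta-wwm-ext-large")

for candidate in fill_mask("今天[MASK]情很好"):
    print(candidate["token_str"], candidate["score"])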
ymcui commented 4 years ago

Hi, please see: https://github.com/ymcui/Chinese-BERT-wwm/issues/76. Also, if you want to do cloze-style (fill-in-the-blank) prediction, you can run a second round of pre-training on free text from your target domain; there should not be much loss in performance. A sketch of one possible setup follows.
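A second round of MLM pre-training on in-domain text can be done with the standard Transformers Trainer; the sketch below is one assumed way to set that up, not the repository's own training script, and domain_corpus.txt is a hypothetical file of in-domain free text (one document per line):

from transformers import (
    BertTokenizer,
    BertForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

model_name = "hfl/chinese-roberta-wwm-ext-large"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForMaskedLM.from_pretrained(model_name)

# Hypothetical in-domain corpus; replace with your own free text.
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

# The collator applies random masking on the fly (15% of tokens by default).
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="roberta-wwm-ext-large-domain-mlm",  # hypothetical output path
    num_train_epochs=1,
    per_device_train_batch_size=8,
    learning_rate=5e-5,
)

Trainer(model=model, args=args, train_dataset=dataset, data_collator=collator).train()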

pxxgogo commented 4 years ago

OK, thanks!

pxxgogo commented 3 years ago

@ymcui Hi, just to double-check: in the latest Hugging Face releases, is it also the case that the TF checkpoints of hfl/chinese-bert-wwm-ext and hfl/chinese-roberta-wwm-ext do not include a trained MLM head? I tried them and the results all look scrambled.
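One way to check whether a released checkpoint actually ships trained MLM-head weights is to inspect the loading info that from_pretrained can return; this is a minimal sketch against the PyTorch weights (output_loading_info is a standard from_pretrained argument, and the model names are the two mentioned above):

from transformers import BertForMaskedLM

for name in ["hfl/chinese-bert-wwm-ext", "hfl/chinese-roberta-wwm-ext"]:
    model, info = BertForMaskedLM.from_pretrained(name, output_loading_info=True)
    # Keys under cls.predictions.* listed in "missing_keys" mean the MLM head was
    # randomly initialized rather than loaded from the checkpoint.
    mlm_missing = [k for k in info["missing_keys"] if k.startswith("cls.predictions")]
    print(name, "missing MLM-head weights:" if mlm_missing else "MLM head loaded", mlm_missing)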