cooelf / SemBERT

Semantics-aware BERT for Language Understanding (AAAI 2020)
MIT License

AllenNLP SRL predictions are inconsistent #12

Closed. deanyan7 closed this issue 4 years ago

deanyan7 commented 4 years ago


For example, take the sentence: The new rights are nice enough

The result given in the sample data is: {"verbs": [{"verb": "are", "description": "[ARG1: The new rights] [V: are] [ARG2: nice enough]", "tags": ["B-ARG1", "I-ARG1", "I-ARG1", "B-V", "B-ARG2", "I-ARG2"]}], "words": ["The", "new", "rights", "are", "nice", "enough"]}

whereas the result predicted by AllenNLP is: [{'verbs': [], 'words': ['The', 'new', 'rights', 'are', 'nice', 'enough']}]

Versions: allennlp 0.8.1, allennlp-models=1.0.0
I also tested allennlp 1.0.0 with allennlp-models=1.0.0

cooelf commented 4 years ago

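This prediction suggests the model did not run properly. As I recall, the output comes back completely empty when the verb is not identified correctly. Did you use the provided data processing (online or offline version)? Please give the detailed steps so I can reproduce this.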

deanyan7 commented 4 years ago


cooelf commented 4 years ago



Related Issue:

deanyan7 commented 4 years ago

The PyTorch version is 1.5.0. I used the srl-model-2018.05.25.tar.gz you provided, with allennlp==0.8.1 and spacy==2.2.4. I also used bert-base-srl-2020.03.24.tar.gz from allennlp-demo, with allennlp==1.0.0 and allennlp-models==1.0.0. The problem occurs in both setups.

deanyan7 commented 4 years ago

To reproduce, with allennlp==0.8.1 and spacy==2.2.4 (sentence is the input string given below):

from allennlp.models import load_archive
from allennlp.predictors import Predictor
archive = load_archive("/model/srl-model-2018.05.25.tar.gz", cuda_device=0)
predictor = Predictor.from_archive(archive)
predictor.predict(sentence)

Or, with allennlp==1.0.0 and allennlp-models==1.0.0:

from allennlp.models import load_archive
from allennlp.predictors import Predictor
archive = load_archive("model/bert-base-srl-2020.03.24.tar.gz", cuda_device=0)
predictor = Predictor.from_archive(archive)
predictor.predict(sentence)

sentence = "yeah i know and i did that all through college and it worked too"

result = {'verbs': [{'verb': 'know', 'description': 'yeah [ARG0: i] [V: know] and i did that all through college and it worked too', 'tags': ['O', 'B-ARG0', 'B-V', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O']}, {'verb': 'worked', 'description': 'yeah i know and i did that all through college and [ARG1: it] [V: worked] [ARGM-ADV: too]', 'tags': ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'B-ARG1', 'B-V', 'B-ARGM-ADV']}], 'words': ['yeah', 'i', 'know', 'and', 'i', 'did', 'that', 'all', 'through', 'college', 'and', 'it', 'worked', 'too']}

Sample result: {"verbs": [{"verb": "know", "description": "yeah [ARG0: i] [V: know] and i did that all through college and it worked too", "tags": ["O", "B-ARG0", "B-V", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O"]}, {"verb": "did", "description": "yeah i know and [ARG0: i] [V: did] [ARG1: that] [ARGM-TMP: all through college] and it worked too", "tags": ["O", "O", "O", "O", "B-ARG0", "B-V", "B-ARG1", "B-ARGM-TMP", "I-ARGM-TMP", "I-ARGM-TMP", "O", "O", "O", "O"]}, {"verb": "worked", "description": "yeah i know and i did that all through college and [ARG0: it] [V: worked] [ARGM-ADV: too]", "tags": ["O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "O", "B-ARG0", "B-V", "B-ARGM-ADV"]}], "words": ["yeah", "i", "know", "and", "i", "did", "that", "all", "through", "college", "and", "it", "worked", "too"]}
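A quick way to see which predicate frames are missing is to compare the verbs in the two outputs. The following is only a minimal sketch: both results are trimmed to their "verb" fields, and the helper name missing_verbs is made up for illustration.

```python
# Results from this thread, trimmed to the "verb" fields only.
predicted = {"verbs": [{"verb": "know"}, {"verb": "worked"}]}
sample = {"verbs": [{"verb": "know"}, {"verb": "did"}, {"verb": "worked"}]}


def missing_verbs(expected, actual):
    """Return verbs that appear in `expected` but not in `actual`, in order."""
    found = {frame["verb"] for frame in actual["verbs"]}
    return [frame["verb"] for frame in expected["verbs"] if frame["verb"] not in found]


print(missing_verbs(sample, predicted))  # ['did']
```

Here the only frame missing from the prediction is the one for "did".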

cooelf commented 4 years ago

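I tried several spaCy versions, and they differ in how they assign the verb tag, which may be what breaks the SRL model's predicate identification. You could switch to an earlier spaCy version (for example 2.0.18, then reinstall the model with python -m spacy download en_core_web_sm).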



import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp("The new rights are nice enough")
print([token.text for token in doc])
print([token.pos_ for token in doc])


spacy 2.0.18:

['yeah', 'i', 'know', 'and', 'i', 'did', 'that', 'all', 'through', 'college', 'and', 'it', 'worked', 'too']
['INTJ', 'PRON', 'VERB', 'CCONJ', 'PRON', 'VERB', 'DET', 'DET', 'ADP', 'NOUN', 'CCONJ', 'PRON', 'VERB', 'ADV']

['The', 'new', 'rights', 'are', 'nice', 'enough']
['DET', 'ADJ', 'NOUN', 'VERB', 'ADJ', 'ADV']

spacy 2.2.4:

['The', 'new', 'rights', 'are', 'nice', 'enough']
['DET', 'ADJ', 'NOUN', 'AUX', 'ADJ', 'ADV']

['yeah', 'i', 'know', 'and', 'i', 'did', 'that', 'all', 'through', 'college', 'and', 'it', 'worked', 'too']
['INTJ', 'PRON', 'VERB', 'CCONJ', 'PRON', 'AUX', 'SCONJ', 'DET', 'ADP', 'NOUN', 'CCONJ', 'PRON', 'VERB', 'ADV']
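The tag lists above line up with both symptoms in this thread if we assume that the SRL predictor only treats tokens tagged VERB as predicates. That rule is an assumption for illustration, not something confirmed in this thread. A minimal sketch using only the printed tags (no spaCy or AllenNLP needed); find_predicates and its predicate_tags parameter are hypothetical names:

```python
def find_predicates(words, pos_tags, predicate_tags=("VERB",)):
    """Return the words whose POS tag counts as a predicate (assumed rule)."""
    return [word for word, tag in zip(words, pos_tags) if tag in predicate_tags]


# Tokens and tags copied from the spaCy output in this thread.
short = ['The', 'new', 'rights', 'are', 'nice', 'enough']
short_2018 = ['DET', 'ADJ', 'NOUN', 'VERB', 'ADJ', 'ADV']
short_224 = ['DET', 'ADJ', 'NOUN', 'AUX', 'ADJ', 'ADV']

long = ['yeah', 'i', 'know', 'and', 'i', 'did', 'that', 'all', 'through',
        'college', 'and', 'it', 'worked', 'too']
long_2018 = ['INTJ', 'PRON', 'VERB', 'CCONJ', 'PRON', 'VERB', 'DET', 'DET',
             'ADP', 'NOUN', 'CCONJ', 'PRON', 'VERB', 'ADV']
long_224 = ['INTJ', 'PRON', 'VERB', 'CCONJ', 'PRON', 'AUX', 'SCONJ', 'DET',
            'ADP', 'NOUN', 'CCONJ', 'PRON', 'VERB', 'ADV']

print(find_predicates(short, short_2018))  # ['are']
print(find_predicates(short, short_224))   # []
print(find_predicates(long, long_2018))    # ['know', 'did', 'worked']
print(find_predicates(long, long_224))     # ['know', 'worked']

# Also accepting AUX recovers the same predicates from the 2.2.4 tags.
print(find_predicates(long, long_224, predicate_tags=("VERB", "AUX")))
```

Under that assumption, the 2.2.4 tags give no predicate for the short sentence and drop "did" from the long one, which matches the empty and partial outputs reported earlier.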

deanyan7 commented 4 years ago
