allenai / allennlp

An open-source NLP research library, built on PyTorch.
http://www.allennlp.org
Apache License 2.0

Coreference Resolution producing different output for same input #3989

Closed Troied closed 4 years ago

Troied commented 4 years ago

I want to get the same coref results as the AllenNLP demo website. I went back to the same version of AllenNLP and the same model that the demo uses, but the result is different when I run it locally.

AllenNLP : commit:17c2ff1ce2cb5e84ab9a0f524e6c01362c242cae (https://github.com/allenai/allennlp/tree/17c2ff1ce2cb5e84ab9a0f524e6c01362c242cae)

Model Used : https://storage.googleapis.com/allennlp-public-models/coref-bert-lstm-2020.02.12.tar.gz

How can I solve this issue?

schmmd commented 4 years ago

@datix-vishu the demo is currently using the model listed in models.json: https://github.com/allenai/allennlp-demo/blob/master/models.json#L35. In other words, what you're doing is exactly what I would do to get identical results.

Can you give an example where the output differs between your local run and the website?

Troied commented 4 years ago

For example, try this text:

Original Text text =Russia formally charged former Marine Paul Whelan with espionage, according to a report Thursday. “An indictment has been presented. Whelan dismisses it,” Interfax news agency reported, citing a source.Whelan, a Michigan native, was arrested last Friday by members of the Russian Federal Security Service (FSB) who accused him of being on a “spy mission,” and he has since been detained at Lefortovo Prison in Moscow. US Ambassador to Russia Jon Huntsman met with Whelan at the prison Wednesday and talked to his family, the State Department said. Whelan’s family said the 48-year-old, who is director of global security for Michigan-based auto supplier BorgWarner, traveled to Moscow last month to attend the wedding of a fellow Marine veteran to a Russian woman. Whelan faces up to 20 years in prison if convicted. The FSB has not released information on why Whelan was arrested. Secretary of State Mike Pompeo said the US has demanded answers from Russia. “We’ve made clear to the Russians our expectation that we will learn more about the charges, come to understand what it is he’s been accused of and if the detention is not appropriate, we will demand his immediate return,” he said.

The output I'm getting ( local ) : Russia formally charged former Marine Paul Whelan with espionage, according to a report Thursday. “An indictment has been presented. former Marine Paul Whelan dismisses charged,” Interfax news agency reported, citing a source.Whelan, a Michigan native, was arrested last Friday by members of the Russian Federal Security Service (FSB) who accused former Marine Paul Whelan of being on a “spy mission,” and former Marine Paul Whelan has since been detained at Lefortovo Prison in Moscow. US Ambassador to Russia Jon Huntsman met with former Marine Paul Whelan at Lefortovo Prison in Moscow Wednesday and talked to former Marine Paul Whelan's family, the State Department said. his family said former Marine Paul Whelan, traveled to Moscow last month to attend the wedding of a fellow Marine veteran to a Russian woman. former Marine Paul Whelan faces up to 20 years in prison if convicted. the Russian Federal Security Service (FSB) has not released information on why former Marine Paul Whelan was arrested. Secretary of State Mike Pompeo said the US has demanded answers from Russia. “the US’ve made clear to the Russians the US's expectation that the US will learn more about the charges, come to understand what it is former Marine Paul Whelan’s been accused of and if arrested is not appropriate, the US will demand former Marine Paul Whelan's immediate return,” Secretary of State Mike Pompeo said.

schmmd commented 4 years ago

Can you send the raw output you get locally? For example, from the demo I get the following:

{"clusters":[[[3,6],[23,23],[60,60],[71,71],[90,90],[98,98],[106,107],[148,148],[167,167],[210,210],[226,226]],[[2,2],[25,25]],[[0,0],[85,85],[183,183]],[[77,80],[92,93]],[[98,99],[106,108]],[[80,80],[131,131]],[[50,57],[159,160]],[[44,44],[169,169],[217,218]],[[177,178],[186,186],[193,193],[196,196],[223,223]],[[171,175],[231,231]]],"document":["Russia","formally","charged","former","Marine","Paul","Whelan","with","espionage",",","according","to","a","report","Thursday",".","\u201c","An","indictment","has","been","presented",".","Whelan","dismisses","it",",","\u201d","Interfax","news","agency","reported",",","citing","a","source",".","Whelan",",","a","Michigan","native",",","was","arrested","last","Friday","by","members","of","the","Russian","Federal","Security","Service","(","FSB",")","who","accused","him","of","being","on","a","\u201c","spy","mission",",","\u201d","and","he","has","since","been","detained","at","Lefortovo","Prison","in","Moscow",".","US","Ambassador","to","Russia","Jon","Huntsman","met","with","Whelan","at","the","prison","Wednesday","and","talked","to","his","family",",","the","State","Department","said",".","Whelan","\u2019s","family","said","the","48-year","-","old",",","who","is","director","of","global","security","for","Michigan","-","based","auto","supplier","BorgWarner",",","traveled","to","Moscow","last","month","to","attend","the","wedding","of","a","fellow","Marine","veteran","to","a","Russian","woman",".","Whelan","faces","up","to","20","years","in","prison","if","convicted",".","The","FSB","has","not","released","information","on","why","Whelan","was","arrested",".","Secretary","of","State","Mike","Pompeo","said","the","US","has","demanded","answers","from","Russia",".","\u201c","We","\u2019ve","made","clear","to","the","Russians","our","expectation","that","we","will","learn","more","about","the","charges",",","come","to","understand","what","it","is","he","\u2019s","been","accused","of","and","if","the","detention","is","not","appr
opriate",",","we","will","demand","his","immediate","return",",","\u201d","he","said","."],"predicted_antecedents":[-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,7,-1,10,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,21,-1,-1,-1,25,-1,-1,-1,-1,33,-1,32,5,-1,-1,-1,4,-1,-1,-1,41,3,-1,-1,-1,-1,18,-1,-1,-1,-1,-1,-1,21,-1,36,-1,-1,4,-1,46,-1,-1,-1,-1,-1,36,-1,-1,5,-1,-1,2,3,-1,-1,-1,-1,79,-1,19,-1,12,-1,5,-1,24,-1],"slug":"MTYyMDg3Mg==","top_spans":[[0,0],[1,1],[2,2],[3,6],[8,8],[12,14],[14,14],[16,16],[16,18],[19,19],[21,21],[23,23],[24,24],[25,25],[27,30],[31,31],[33,33],[34,35],[44,44],[45,45],[45,46],[48,67],[48,69],[50,57],[59,59],[60,60],[62,62],[64,67],[69,69],[71,71],[75,75],[77,80],[80,80],[82,87],[85,85],[88,88],[90,90],[92,93],[94,94],[95,95],[96,96],[98,98],[98,99],[101,103],[104,104],[106,107],[106,108],[109,109],[110,128],[122,127],[129,129],[131,131],[132,133],[135,135],[136,142],[136,146],[139,142],[144,146],[148,148],[157,157],[159,160],[163,163],[164,169],[167,167],[168,168],[169,169],[171,175],[176,176],[177,178],[180,180],[181,181],[183,183],[185,185],[185,187],[186,186],[188,188],[191,192],[193,193],[196,196],[198,198],[201,202],[204,204],[208,208],[210,210],[213,213],[217,218],[219,219],[223,223],[225,225],[226,226],[230,230],[231,231],[232,232]]}
Troied commented 4 years ago

{'top_spans': [[0, 0], [1, 1], [2, 2], [3, 6], [8, 8], [12, 14], [14, 14], [16, 16], [19, 19], [21, 21], [23, 23], [24, 24], [25, 25], [27, 30], [31, 31], [33, 33], [34, 35], [44, 44], [45, 45], [45, 46], [48, 67], [48, 69], [50, 57], [59, 59], [60, 60], [62, 62], [64, 67], [69, 69], [71, 71], [75, 75], [77, 80], [80, 80], [82, 87], [85, 85], [88, 88], [90, 90], [92, 93], [94, 94], [95, 95], [96, 96], [98, 98], [98, 99], [101, 103], [104, 104], [106, 107], [106, 108], [109, 109], [110, 127], [110, 128], [122, 127], [129, 129], [131, 131], [132, 133], [135, 135], [136, 142], [136, 146], [139, 142], [144, 146], [148, 148], [157, 157], [159, 160], [163, 163], [164, 169], [167, 167], [168, 168], [169, 169], [171, 175], [176, 176], [177, 178], [180, 180], [181, 181], [183, 183], [185, 185], [186, 186], [188, 188], [191, 192], [193, 193], [196, 196], [198, 198], [201, 202], [204, 204], [208, 208], [209, 209], [210, 210], [213, 213], [217, 218], [219, 219], [223, 223], [225, 225], [226, 226], [230, 230], [231, 231], [232, 232]], 'predicted_antecedents': [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 6, -1, 9, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 20, -1, -1, -1, 24, -1, -1, -1, -1, 32, -1, 31, 5, -1, -1, -1, 4, -1, -1, -1, 40, 3, -1, 2, -1, -1, -1, 19, -1, -1, -1, -1, -1, -1, 22, -1, 37, -1, -1, 4, -1, 47, -1, -1, -1, -1, -1, 37, -1, 4, -1, -1, 2, 3, -1, -1, -1, -1, -1, 79, -1, 19, -1, 13, -1, 5, -1, 24, -1], 'document': ['Russia', 'formally', 'charged', 'former', 'Marine', 'Paul', 'Whelan', 'with', 'espionage', ',', 'according', 'to', 'a', 'report', 'Thursday', '.', '“', 'An', 'indictment', 'has', 'been', 'presented', '.', 'Whelan', 'dismisses', 'it', ',', '”', 'Interfax', 'news', 'agency', 'reported', ',', 'citing', 'a', 'source', '.', 'Whelan', ',', 'a', 'Michigan', 'native', ',', 'was', 'arrested', 'last', 'Friday', 'by', 'members', 'of', 'the', 'Russian', 'Federal', 'Security', 'Service', '(', 'FSB', ')', 'who', 'accused', 'him', 'of', 'being', 'on', 'a', '“', 'spy', 
'mission', ',', '”', 'and', 'he', 'has', 'since', 'been', 'detained', 'at', 'Lefortovo', 'Prison', 'in', 'Moscow', '.', 'US', 'Ambassador', 'to', 'Russia', 'Jon', 'Huntsman', 'met', 'with', 'Whelan', 'at', 'the', 'prison', 'Wednesday', 'and', 'talked', 'to', 'his', 'family', ',', 'the', 'State', 'Department', 'said', '.', 'Whelan', '’s', 'family', 'said', 'the', '48-year', '-', 'old', ',', 'who', 'is', 'director', 'of', 'global', 'security', 'for', 'Michigan', '-', 'based', 'auto', 'supplier', 'BorgWarner', ',', 'traveled', 'to', 'Moscow', 'last', 'month', 'to', 'attend', 'the', 'wedding', 'of', 'a', 'fellow', 'Marine', 'veteran', 'to', 'a', 'Russian', 'woman', '.', 'Whelan', 'faces', 'up', 'to', '20', 'years', 'in', 'prison', 'if', 'convicted', '.', 'The', 'FSB', 'has', 'not', 'released', 'information', 'on', 'why', 'Whelan', 'was', 'arrested', '.', 'Secretary', 'of', 'State', 'Mike', 'Pompeo', 'said', 'the', 'US', 'has', 'demanded', 'answers', 'from', 'Russia', '.', '“', 'We', '’ve', 'made', 'clear', 'to', 'the', 'Russians', 'our', 'expectation', 'that', 'we', 'will', 'learn', 'more', 'about', 'the', 'charges', ',', 'come', 'to', 'understand', 'what', 'it', 'is', 'he', '’s', 'been', 'accused', 'of', 'and', 'if', 'the', 'detention', 'is', 'not', 'appropriate', ',', 'we', 'will', 'demand', 'his', 'immediate', 'return', ',', '”', 'he', 'said', '.'], 'clusters': [[[3, 6], [23, 23], [60, 60], [71, 71], [90, 90], [98, 98], [106, 107], [110, 127], [148, 148], [167, 167], [210, 210], [226, 226]], [[2, 2], [25, 25]], [[0, 0], [85, 85], [183, 183]], [[77, 80], [92, 93]], [[98, 99], [106, 108]], [[80, 80], [131, 131]], [[50, 57], [159, 160]], [[44, 44], [169, 169], [217, 218]], [[177, 178], [186, 186], [193, 193], [196, 196], [223, 223]], [[171, 175], [231, 231]]]}

schmmd commented 4 years ago

Here are the cluster differences:

Demo:

[
    [[3, 6], [23, 23], [60, 60], [71, 71], [90, 90], [98, 98], [106, 107], [148, 148], [167, 167], [210, 210], [226, 226]],
    [[2, 2], [25, 25]],
    [[0, 0], [85, 85], [183, 183]],
    [[77, 80], [92, 93]],
    [[98, 99], [106, 108]],
    [[80, 80], [131, 131]],
    [[50, 57], [159, 160]],
    [[44, 44], [169, 169], [217, 218]],
    [[177, 178], [186, 186], [193, 193], [196, 196], [223, 223]],
    [[171, 175], [231, 231]]
]

@datix-vishu :

[
    [[3, 6], [23, 23], [60, 60], [71, 71], [90, 90], [98, 98], [106, 107], [110, 127], [148, 148], [167, 167], [210, 210], [226, 226]],
    [[2, 2], [25, 25]],
    [[0, 0], [85, 85], [183, 183]],
    [[77, 80], [92, 93]],
    [[98, 99], [106, 108]],
    [[80, 80], [131, 131]],
    [[50, 57], [159, 160]],
    [[44, 44], [169, 169], [217, 218]],
    [[177, 178], [186, 186], [193, 193], [196, 196], [223, 223]],
    [[171, 175], [231, 231]]
]

Note that @datix-vishu's output contains [110, 127].
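
The comparison above can also be done mechanically. The following is a minimal sketch, not part of AllenNLP: the helper names `mention_set`, `diff_mentions`, and `span_text` are made up for illustration, and it assumes the `clusters` and `document` fields have the shape shown in the outputs in this thread (inclusive `[start, end]` token indices). It compares which mentions appear in each output, not which cluster they belong to.

```python
from typing import Dict, List, Sequence, Set, Tuple

Span = Tuple[int, int]


def mention_set(clusters: Sequence[Sequence[Sequence[int]]]) -> Set[Span]:
    """Flatten a 'clusters' list into a set of (start, end) token spans."""
    return {(start, end) for cluster in clusters for start, end in cluster}


def diff_mentions(demo_clusters, local_clusters) -> Dict[str, List[Span]]:
    """Return the spans that appear in only one of the two outputs."""
    demo = mention_set(demo_clusters)
    local = mention_set(local_clusters)
    return {
        "only_in_demo": sorted(demo - local),
        "only_in_local": sorted(local - demo),
    }


def span_text(document: Sequence[str], span: Span) -> str:
    """Join the tokens covered by an inclusive (start, end) span."""
    start, end = span
    return " ".join(document[start : end + 1])


# First cluster from each output in this thread; the other clusters are identical.
demo_first = [[3, 6], [23, 23], [60, 60], [71, 71], [90, 90], [98, 98],
              [106, 107], [148, 148], [167, 167], [210, 210], [226, 226]]
local_first = [[3, 6], [23, 23], [60, 60], [71, 71], [90, 90], [98, 98],
               [106, 107], [110, 127], [148, 148], [167, 167], [210, 210],
               [226, 226]]

diff = diff_mentions([demo_first], [local_first])
print(diff)
```

Passing the full `document` token list and a span from `diff["only_in_local"]` to `span_text` shows which phrase the extra mention covers. The same set difference can be applied to `top_spans` by wrapping it in a one-element list.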

Troied commented 4 years ago

I noticed that too, but I haven't been able to find the cause. Do you have any idea what it could be?

schmmd commented 4 years ago

@ZhaofengWu any ideas why datix-vishu's model output is different between the demo and locally? It seems like they should be the same. Please note that I didn't try to replicate his results locally.

Troied commented 4 years ago

This is my code:

    from allennlp.predictors import Predictor

    coref_model = Predictor.from_path('C:/Users/Hacker/Desktop/Work/Untitled Folder/coref-bert-lstm-2020.02.12.tar.gz')
    coref_model.predict(text)
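
The demo returns JSON while a local run prints a Python dict, which makes the two hard to compare by eye. Below is a small sketch for normalising both to the same form; it assumes the local result is a plain dict like the one returned by `predict`, and `to_canonical_json` is a hypothetical helper name, not an AllenNLP function. The sample values are shortened stand-ins, not real model output.

```python
import json


def to_canonical_json(result: dict) -> str:
    """Serialise a prediction dict with sorted keys so two runs can be diffed."""
    return json.dumps(result, sort_keys=True, ensure_ascii=True)


# Shortened stand-ins for a local dict and the demo's JSON response.
local = {"clusters": [[[3, 6], [23, 23]]], "document": ["Russia", "\u201c"]}
demo_json = '{"document":["Russia","\\u201c"],"clusters":[[[3,6],[23,23]]]}'

same = to_canonical_json(local) == to_canonical_json(json.loads(demo_json))
print(same)
```

Writing both canonical strings to files and running a text diff on them would then highlight only the real differences, independent of key order or quoting style.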

ZhaofengWu commented 4 years ago

@datix-vishu The following is what I'm getting locally, which is the same as the demo. I'm guessing you may have an environment issue?

{'top_spans': [[0, 0], [1, 1], [2, 2], [3, 6], [8, 8], [12, 14], [14, 14], [16, 16], [16, 18], [19, 19], [21, 21], [23, 23], [24, 24], [25, 25], [27, 30], [31, 31], [33, 33], [34, 35], [44, 44], [45, 45], [45, 46], [48, 67], [48, 69], [50, 57], [59, 59], [60, 60], [62, 62], [64, 67], [69, 69], [71, 71], [75, 75], [77, 80], [80, 80], [82, 87], [85, 85], [88, 88], [90, 90], [92, 93], [94, 94], [95, 95], [96, 96], [98, 98], [98, 99], [101, 103], [104, 104], [106, 107], [106, 108], [109, 109], [110, 128], [122, 127], [129, 129], [131, 131], [132, 133], [135, 135], [136, 142], [136, 146], [139, 142], [144, 146], [148, 148], [157, 157], [159, 160], [163, 163], [164, 169], [167, 167], [168, 168], [169, 169], [171, 175], [176, 176], [177, 178], [180, 180], [181, 181], [183, 183], [185, 185], [185, 187], [186, 186], [188, 188], [191, 192], [193, 193], [196, 196], [198, 198], [201, 202], [204, 204], [208, 208], [210, 210], [213, 213], [217, 218], [219, 219], [223, 223], [225, 225], [226, 226], [230, 230], [231, 231], [232, 232]], 'predicted_antecedents': [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 7, -1, 10, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 21, -1, -1, -1, 25, -1, -1, -1, -1, 33, -1, 32, 5, -1, -1, -1, 4, -1, -1, -1, 41, 3, -1, -1, -1, -1, 18, -1, -1, -1, -1, -1, -1, 21, -1, 36, -1, -1, 4, -1, 46, -1, -1, -1, -1, -1, 36, -1, -1, 5, -1, -1, 2, 3, -1, -1, -1, -1, 79, -1, 19, -1, 12, -1, 5, -1, 24, -1], 'document': ['Russia', 'formally', 'charged', 'former', 'Marine', 'Paul', 'Whelan', 'with', 'espionage', ',', 'according', 'to', 'a', 'report', 'Thursday', '.', '“', 'An', 'indictment', 'has', 'been', 'presented', '.', 'Whelan', 'dismisses', 'it', ',', '”', 'Interfax', 'news', 'agency', 'reported', ',', 'citing', 'a', 'source', '.', 'Whelan', ',', 'a', 'Michigan', 'native', ',', 'was', 'arrested', 'last', 'Friday', 'by', 'members', 'of', 'the', 'Russian', 'Federal', 'Security', 'Service', '(', 'FSB', ')', 'who', 'accused', 'him', 'of', 'being', 'on', 'a', '“', 'spy', 
'mission', ',', '”', 'and', 'he', 'has', 'since', 'been', 'detained', 'at', 'Lefortovo', 'Prison', 'in', 'Moscow', '.', 'US', 'Ambassador', 'to', 'Russia', 'Jon', 'Huntsman', 'met', 'with', 'Whelan', 'at', 'the', 'prison', 'Wednesday', 'and', 'talked', 'to', 'his', 'family', ',', 'the', 'State', 'Department', 'said', '.', 'Whelan', '’s', 'family', 'said', 'the', '48-year', '-', 'old', ',', 'who', 'is', 'director', 'of', 'global', 'security', 'for', 'Michigan', '-', 'based', 'auto', 'supplier', 'BorgWarner', ',', 'traveled', 'to', 'Moscow', 'last', 'month', 'to', 'attend', 'the', 'wedding', 'of', 'a', 'fellow', 'Marine', 'veteran', 'to', 'a', 'Russian', 'woman', '.', 'Whelan', 'faces', 'up', 'to', '20', 'years', 'in', 'prison', 'if', 'convicted', '.', 'The', 'FSB', 'has', 'not', 'released', 'information', 'on', 'why', 'Whelan', 'was', 'arrested', '.', 'Secretary', 'of', 'State', 'Mike', 'Pompeo', 'said', 'the', 'US', 'has', 'demanded', 'answers', 'from', 'Russia', '.', '“', 'We', '’ve', 'made', 'clear', 'to', 'the', 'Russians', 'our', 'expectation', 'that', 'we', 'will', 'learn', 'more', 'about', 'the', 'charges', ',', 'come', 'to', 'understand', 'what', 'it', 'is', 'he', '’s', 'been', 'accused', 'of', 'and', 'if', 'the', 'detention', 'is', 'not', 'appropriate', ',', 'we', 'will', 'demand', 'his', 'immediate', 'return', ',', '”', 'he', 'said', '.'], 'clusters': [[[3, 6], [23, 23], [60, 60], [71, 71], [90, 90], [98, 98], [106, 107], [148, 148], [167, 167], [210, 210], [226, 226]], [[2, 2], [25, 25]], [[0, 0], [85, 85], [183, 183]], [[77, 80], [92, 93]], [[98, 99], [106, 108]], [[80, 80], [131, 131]], [[50, 57], [159, 160]], [[44, 44], [169, 169], [217, 218]], [[177, 178], [186, 186], [193, 193], [196, 196], [223, 223]], [[171, 175], [231, 231]]]}

Troied commented 4 years ago

Can you please share the model you used for the above result? It would also be really helpful if you could share which version of spacy you used.

In my case, I'm using:

OS : Windows 10, 64bit
AllenNLP : https://github.com/allenai/allennlp/tree/17c2ff1ce2cb5e84ab9a0f524e6c01362c242cae
Model : https://storage.googleapis.com/allennlp-public-models/coref-bert-lstm-2020.02.12.tar.gz
Spacy : 2.1.0
torch : 1.4.0
overrides : 2.8.0
nltk : 3.4.5
tensorboardX : 1.9
boto3 : 1.9.190
requests : 2.18.4
tqdm : 4.31.1
h5py : 2.9.0
scikit-learn : 0.20.4
scipy : 1.2.0
pytest : 5.0.1
flaky : 3.6.1
responses : 0.10.6
conllu : 2.2.2
transformers : 2.4.0
jsonpickle : 1.2

ZhaofengWu commented 4 years ago

I'm using spacy 2.2.3 and transformers 2.4.1. Also, we don't officially support Windows yet, which could be a reason.

Troied commented 4 years ago

Oh, I see. Thanks for your time, brother. I have one more question: does the current demo use one version of AllenNLP for fine-grained NER and a different version for coref?

ZhaofengWu commented 4 years ago

I think that's right. See https://github.com/allenai/allennlp-demo/blob/master/models.json#L44-L48

@schmmd?

Troied commented 4 years ago

Updating Spacy from 2.1.0 to 2.2.3 solved the issue.
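
For anyone hitting the same mismatch, a quick way to check for version drift before loading the model is sketched below. It uses the standard-library `importlib.metadata` module, so it assumes Python 3.8 or newer. The `EXPECTED` pins are the versions @ZhaofengWu reported in this thread, not an official requirements list, and the helper names are made up for illustration.

```python
from importlib import metadata  # standard library, Python 3.8+
from typing import Dict, Iterable, Optional, Tuple


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Look up the installed version of each package, or None if it is missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def mismatches(
    expected: Dict[str, str], installed: Dict[str, Optional[str]]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (expected, installed)} for every version that differs."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


# Versions reported as reproducing the demo output in this thread (an assumption,
# not an official pin list).
EXPECTED = {"spacy": "2.2.3", "transformers": "2.4.1"}

report = mismatches(EXPECTED, installed_versions(EXPECTED))
print(report)
```

An empty dict means the environment matches the pins; any entry shows the expected version next to what is actually installed.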

@schmmd and @ZhaofengWu, thanks for your valuable time, brothers. This package is truly amazing and fun to work with. Can't wait to see v1.0!

Once again thanks for your support, have a good day :-)