mandarjoshi90 / coref

BERT for Coreference Resolution
Apache License 2.0

Understanding the output #73

Open Rajmehta123 opened 3 years ago

Rajmehta123 commented 3 years ago

Question

@mandarjoshi90 @jkkummerfeld I am new to coreference resolution and still exploring: learning to read the coreference output and to understand the parameters.

I looked through several resources but could not find an explanation of the output.

Given the following input, I would appreciate it if someone could explain the output and how to interpret it.

Input

sentence = ["Paul Allen was born on January 21, 1953, in Seattle, Washington, to Kenneth Sam Allen and Edna Faye Allen.", "Allen attended Lakeside School, a private school in Seattle, where he befriended Bill Gates, two years younger, with whom he shared an enthusiasm for computers.", "Paul and Bill used a teletype terminal at their high school, Lakeside, to develop their programming skills on several time-sharing computer systems."]

Output:

Clusters: [((0, 2), 'Paul Allen'), ((19, 20), 'Allen'), ((29, 30), 'he'), ((38, 39), 'he'), ((44, 45), 'Paul')],

[((9, 11), 'Seattle , Washington'), ((27, 28), 'Seattle')],

[((31, 44), 'Bill Gates , two years younger , with whom he shared an enthusiasm for computers'), ((46, 47), 'Bill')],

[((44, 47), 'Paul and Bill'), ((52, 53), 'their'), ((58, 59), 'their')]

Mentions: ((0, 11), 'Paul Allen was born on January 21 , 1953 , in Seattle , Washington ,'),

((0, 19), 'Paul Allen was born on January 21 , 1953 , in Seattle , Washington , to Kenneth Sam Allen and Edna Faye Allen'),

((3, 4), 'born'),

((5, 8), 'January 21 , 1953 ,'),

((5, 11), 'January 21 , 1953 , in Seattle , Washington ,'),

((9, 11), 'Seattle , Washington ,'),

((10, 11), 'Washington'),

((12, 19), 'Kenneth Sam Allen and Edna Faye Allen'),

((16, 19), 'Edna Faye Allen'),

((20, 21), 'attended'),

((21, 28), 'Lakeside School , a private school in Seattle'),

((21, 28), 'Lakeside School , a private school in Seattle ,'),

((21, 44), 'Lakeside School , a private school in Seattle , where he befriended Bill Gates , two years younger , with whom he shared an enthusiasm for computers'),

((39, 40), 'shared'),

((40, 44), 'an enthusiasm for computers'),

((48, 51), 'a teletype terminal'),

((48, 66), 'a teletype terminal at their high school , Lakeside , to develop their programming skills on several time - sharing computer systems'),

((52, 56), 'their high school , Lakeside ,'),

((52, 66), 'their high school , Lakeside , to develop their programming skills on several time - sharing computer systems'),

((57, 58), 'develop'),

((58, 61), 'their programming skills'),

((62, 66), 'several time - sharing computer systems')

jkkummerfeld commented 3 years ago

For an introduction to the task, see this chapter from Dan Jurafsky's book:

https://web.stanford.edu/~jurafsky/slp3/22.pdf

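The output you have shows clusters and mentions. Each entry pairs a span of token positions in the text with the text it covers. The spans are Python-style slices (start inclusive, end exclusive), so (0, 2) gives you the first two tokens. Each cluster is a list of mentions that refer to the same entity.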
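To make the slice convention concrete, here is a minimal sketch that maps spans from the first cluster back to text. It assumes the positions index whitespace-separated words of the three input sentences joined together. That appears to match the spans in this particular output, but it is an assumption and not something the thread confirms; the model's own tokenizer may split text differently. The helper `span_text` is a hypothetical name used only for illustration.

```python
# Input sentences from the question, joined into a single text.
sentences = [
    "Paul Allen was born on January 21, 1953, in Seattle, Washington, "
    "to Kenneth Sam Allen and Edna Faye Allen.",
    "Allen attended Lakeside School, a private school in Seattle, where he "
    "befriended Bill Gates, two years younger, with whom he shared an "
    "enthusiasm for computers.",
    "Paul and Bill used a teletype terminal at their high school, Lakeside, "
    "to develop their programming skills on several time-sharing computer "
    "systems.",
]

# Assumption: positions count whitespace-separated words across all sentences.
words = " ".join(sentences).split()


def span_text(words, span):
    """Return the text covered by a (start, end) span; end is exclusive."""
    start, end = span
    return " ".join(words[start:end])


# Spans of the first cluster in the output above.
cluster = [(0, 2), (19, 20), (29, 30), (38, 39), (44, 45)]

for span in cluster:
    print(span, span_text(words, span))
```

Under that assumption, every span in the cluster resolves to a mention of the same person ("Paul Allen", "Allen", "he", "he", "Paul"), which is what a coreference cluster is meant to express.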