Open Fan-Luo opened 3 years ago
Thanks, I seem to be having the same problem when linking entities and mentions across different sentences.
import spacy
import neuralcoref
from spacy import displacy
nlp = spacy.load("en_core_web_lg")
neuralcoref.add_to_pipe(nlp, greedyness=0.5, max_dist=200)
text = '''Every Tuesday and Friday, Recode’s Kara Swisher and NYU Professor Scott Galloway offer sharp, unfiltered insights into the biggest stories in tech, business, and politics. They make bold predictions, pick winners and losers, and bicker and banter like no one else. Kara is out welcoming the newest member of the Pivot family! Scott is joined by co-host Stephanie Ruhle to talk about The Great Resignation, inflation, J&J’s split, and Steve Bannon’s indictment. Also, Elon is still bullying senators on Twitter, and Beto is officially running for Governor of Texas. Plus, Scott chats with Friend of Pivot, Founder and CEO of Boom Supersonic, Blake Scholl about supersonic air travel.'''
doc = nlp(text)
sentence_spans = list(doc.sents)
displacy.render(sentence_spans, style="ent")
import tabulate
rows = []
for ent in doc.ents:
    # Only PERSON entities are of interest here.
    if ent.label_ != 'PERSON':
        continue
    row = [ent.text, ent.label_]
    # coref_cluster is None when the entity was not placed in any cluster.
    cluster = ent._.coref_cluster
    if cluster is not None:
        row.extend([cluster.main.text, cluster.mentions])
    else:
        row.extend([None, None])
    rows.append(row)
table = tabulate.tabulate(rows, headers=["Entity", "Type", "Cluster id", "Cluster mentions"])
print(table)
I get the following output:
Entity           Type    Cluster id    Cluster mentions
---------------  ------  ------------  ------------------
Kara Swisher     PERSON
Scott Galloway   PERSON
Kara             PERSON
Pivot            PERSON
Scott            PERSON  Scott         [Scott, Scott]
Stephanie Ruhle  PERSON
Steve Bannon’s   PERSON
Elon             PERSON
Beto             PERSON
Scott            PERSON  Scott         [Scott, Scott]
Friend of Pivot  PERSON
Blake Scholl     PERSON
I was expecting the entity Kara Swisher to be linked to the mention Kara, and Scott Galloway to Scott, since they are not that far apart.
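As a stopgap for the linking described above, short PERSON mentions could be tied to an earlier full name with a simple token match, independent of the coreference model. This is only a minimal sketch: `link_partial_names` is a hypothetical helper, not part of spaCy or neuralcoref, and it assumes the PERSON entity texts have already been collected in document order (for example from `doc.ents`).

```python
def link_partial_names(person_entities):
    """Pair each single-token PERSON mention with the most recent earlier
    multi-token PERSON entity that contains it as a whole token.

    Hypothetical helper for illustration; returns (mention, full_name)
    pairs, with full_name set to None when no earlier match exists.
    """
    links = []
    full_names = []  # multi-token names seen so far, in document order
    for name in person_entities:
        if len(name.split()) > 1:
            # A full name links to itself and becomes a candidate target.
            full_names.append(name)
            links.append((name, name))
            continue
        # Search backwards so the closest preceding full name wins.
        match = next(
            (full for full in reversed(full_names) if name in full.split()),
            None,
        )
        links.append((name, match))
    return links


people = ["Kara Swisher", "Scott Galloway", "Kara", "Scott", "Stephanie Ruhle", "Elon"]
links = link_partial_names(people)
for mention, full_name in links:
    print(mention, "->", full_name)
```

The match is purely lexical, so it would wrongly merge two different people who share a first name; it is meant as a fallback, not a replacement for coreference resolution.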
Hi,
Thank you for developing this extension. Does the current implementation have an option for cross-sentence coreference resolution? My workaround is to add
, </sent> ,
as a sentence separator. It seems to work well in many cases, but sometimes the resolver swallows the separator, so I cannot map the resolved text back to the original sentences. For example:
sents:
Neil Mallon Pierce Bush (born January 22, 1955) is an American businessman and investor,
, </sent> ,
He is the fourth of six children of former President George H, W, Bush and Barbara Bush (née Pierce),, </sent> ,
His five siblings are George W, Bush, the 43rd President of the United States; Jeb Bush, a former governor of Florida; Robin Bush, died of leukemia at the age of three; Marvin; and Dorothy,, </sent> ,
Neil Bush is currently a businessman based in Texas,
number of sents before coreference resolution: 4
resolved_sents:
Neil Mallon Pierce Bush (born January 22, 1955) is an American businessman and investor,
, </sent> ,
Neil Mallon Pierce Bush (born January 22, 1955) is the fourth of six children of former President George H, W, Bush and Barbara Bush (née Pierce),, </sent> ,
Neil Mallon Pierce Bush (born January 22, 1955) five siblings are George W, Bush, the 43rd President of the United States; Jeb Bush, a former governor of Florida; Barbara Bush, died of leukemia at the age of three; Marvin; and Dorothy, , Neil Mallon Pierce Bush (born January 22, 1955) Bush is currently a businessman based in Texas,
number of sents after coreference resolution: 3
I hope there is an existing solution that I did not notice. If not, could you suggest how to fix my workaround?
Thank you
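One way to avoid the separator entirely might be to keep sentence boundaries as character offsets and apply the mention replacements sentence by sentence, so the number of sentences cannot change. The sketch below is plain Python and makes several assumptions: `resolve_per_sentence` is a hypothetical helper, the sentence spans and replacements are hand-built here, and replacements are assumed not to overlap. With spaCy and neuralcoref the inputs could presumably be derived from `(s.start_char, s.end_char)` for each sentence in `doc.sents` and `(m.start_char, m.end_char, cluster.main.text)` for each mention in `doc._.coref_clusters`.

```python
def resolve_per_sentence(text, sentence_spans, replacements):
    """Return one resolved string per original sentence.

    sentence_spans: list of (start, end) character offsets into text.
    replacements:   list of (start, end, new_text) character offsets of
                    mentions to rewrite; assumed to be non-overlapping.
    Hypothetical helper for illustration only.
    """
    resolved = []
    for sent_start, sent_end in sentence_spans:
        # Keep only replacements that lie fully inside this sentence.
        inside = sorted(
            r for r in replacements if sent_start <= r[0] and r[1] <= sent_end
        )
        pieces = []
        cursor = sent_start
        for start, end, new_text in inside:
            pieces.append(text[cursor:start])  # untouched text before the mention
            pieces.append(new_text)            # substituted antecedent
            cursor = end
        pieces.append(text[cursor:sent_end])   # remainder of the sentence
        resolved.append("".join(pieces))
    return resolved


text = "Neil Bush is a businessman. He lives in Texas. His brother is Jeb Bush."
sentences = [
    "Neil Bush is a businessman.",
    "He lives in Texas.",
    "His brother is Jeb Bush.",
]

# Derive character offsets for each sentence instead of hard-coding them.
spans = []
pos = 0
for sentence in sentences:
    start = text.index(sentence, pos)
    spans.append((start, start + len(sentence)))
    pos = start + len(sentence)

he = text.index("He ")
his = text.index("His ")
replacements = [(he, he + 2, "Neil Bush"), (his, his + 3, "Neil Bush's")]

resolved = resolve_per_sentence(text, spans, replacements)
for sentence in resolved:
    print(sentence)
```

Because each output string is built from exactly one input span, the resolved sentences map back to the originals by index, with no separator token for the resolver to swallow.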