If you're using spacy-stanza, the named entities predicted by the Stanza model are translated to spaCy's data structures. So entity spans are reflected in doc.ents and at the token level, just like in spaCy. If you need the token start and end, you could do:
print([(ent.text, ent.start, ent.end) for ent in doc.ents])
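Note that ent.start and ent.end are document-level token indices, and ent.end is exclusive. If you need the indices relative to each sentence instead, you can shift them by the sentence's start index. Here is a minimal pure-Python sketch of that arithmetic. It uses hand-written tuples in place of real spaCy spans, and the helper name and sample data are hypothetical:

```python
def to_sentence_relative(sentences, entities):
    """Map document-level entity spans to sentence-relative spans.

    sentences: list of (start, end) token indices, end exclusive.
    entities:  list of (text, start, end) token indices, end exclusive.
    Returns a list of (sentence_index, text, rel_start, rel_end).
    """
    result = []
    for text, start, end in entities:
        for sent_idx, (s_start, s_end) in enumerate(sentences):
            # Keep the entity only if it lies fully inside this sentence.
            if s_start <= start and end <= s_end:
                result.append((sent_idx, text, start - s_start, end - s_start))
                break
    return result


# Hand-written sample (assumed tokenization):
# "Barack Obama was born in Hawaii ."  -> tokens 0..6
# "He lives in Washington ."           -> tokens 7..11
sentences = [(0, 7), (7, 12)]
entities = [("Barack Obama", 0, 2), ("Hawaii", 5, 6), ("Washington", 10, 11)]

relative = to_sentence_relative(sentences, entities)
print(relative)
```

With real spaCy objects, the same values would presumably come from each sentence span's start and end and from each entity's start and end.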
I was wondering whether it is possible to access named entity indices at the token level. For example:

"Barack Obama was born in Hawaii."
NE = Barack Obama
NE_start: 0
NE_end: 2

I'm working on a project and need the start and end index of each named entity in a given sentence. spaCy does provide entity indices at the token level (but does not provide named entity recognition at the sentence level), while Stanza does provide named entity recognition at the sentence level (but does not provide entity indices at the token level), so I'm not happy with either of them. I was able to work my way through using the id attributes of token objects in Stanza, but I'm stuck when a named entity is made up of more than one token. Thank you in advance.
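For the multi-token case, one library-independent option is to map an entity's character span onto the tokens it overlaps. Below is a minimal sketch, assuming character offsets are available for both tokens and entities. The offsets are hand-written for the example sentence and the helper name is hypothetical:

```python
def char_span_to_token_span(token_offsets, ent_start_char, ent_end_char):
    """Return (start, end) token indices, end exclusive, for the tokens
    overlapping the character span, or None if no token overlaps it."""
    covered = [
        i
        for i, (tok_start, tok_end) in enumerate(token_offsets)
        if tok_start < ent_end_char and tok_end > ent_start_char
    ]
    if not covered:
        return None
    return covered[0], covered[-1] + 1


text = "Barack Obama was born in Hawaii."
# Hand-written (start_char, end_char) per token, end exclusive:
# Barack, Obama, was, born, in, Hawaii, .
token_offsets = [(0, 6), (7, 12), (13, 16), (17, 21), (22, 24), (25, 31), (31, 32)]

obama_span = char_span_to_token_span(token_offsets, 0, 12)    # "Barack Obama"
hawaii_span = char_span_to_token_span(token_offsets, 25, 31)  # "Hawaii"
print(obama_span, hawaii_span)
```

The end index is exclusive here, which matches the NE_start: 0, NE_end: 2 convention in the example above.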