egerber / spaCy-entity-linker

spaCy module for linking text to Wikidata items
MIT License

Linking entities to underlying spans #20

Closed dennlinger closed 1 year ago

dennlinger commented 1 year ago

So far, EntityLinker only allows access to the document-level or sentence-level EntityCollections. For me, a particularly useful pattern is direct access through arbitrary underlying Spans.

Span-level attributes are already possible in the library thanks to the sentence-level support, since sentences in spaCy are themselves represented as a Span. This PR simply attaches each individual EntityElement to its associated span.

Example usage:

import spacy

text = ""  # placeholder: supply any text with at least four tokens
nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("entityLinker", last=True)
doc = nlp(text)

# Previously simply returned "None"
print(doc[2:4]._.linkedEntities)  # <EntityElement: ...>

This is also useful for mappings over entities extracted with spaCy, as described in #18.
One minor nitpick of my own change: ._.linkedEntities no longer returns a consistent type. Previously it always returned an EntityCollection; now it returns an EntityElement when accessed on a span that corresponds to a linked entity.
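
If the mixed return type is a concern for callers, a small normalizing helper could hide it. The following is only a sketch: the two classes are hypothetical stand-ins for the library's EntityCollection and EntityElement (so the snippet runs without spaCy installed), the `entities` field and the helper name `as_entity_list` are assumptions of mine, and I assume that spans without a linked entity still yield None.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class EntityElement:
    """Hypothetical stand-in for the library's EntityElement (one linked entity)."""
    label: str


@dataclass
class EntityCollection:
    """Hypothetical stand-in for the library's EntityCollection (a group of entities)."""
    entities: List[EntityElement] = field(default_factory=list)


def as_entity_list(
    linked: Union[EntityCollection, EntityElement, None],
) -> List[EntityElement]:
    """Normalize the value of ._.linkedEntities to a plain list of elements."""
    if linked is None:
        # Assumed case: the span is not linked to any entity.
        return []
    if isinstance(linked, EntityElement):
        # Span-level access: a single element, wrapped in a list.
        return [linked]
    # Document- or sentence-level access: unpack the collection.
    return list(linked.entities)
```

With the real library, the argument would be something like `doc[2:4]._.linkedEntities` or `doc._.linkedEntities`, so calling code can always iterate over the result regardless of the level it was read from.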

Again, happy to discuss the necessity of this change for the library first :)
Best,
Dennis

MartinoMensio commented 1 year ago

Hi @dennlinger, yes, this is much needed in this library and really useful! I will double-check and then merge.

Martino