egerber / spaCy-entity-linker

spaCy module for linking text to Wikidata items
MIT License

Are categories or span.labels_ retrievable? #5

Closed: jayhpatel14 closed this issue 3 years ago

jayhpatel14 commented 3 years ago

Is there a way to retrieve an EntityElement's category from Wikidata, or its span label? When I try, for example:

entity.get_span().label_

it only prints a blank line.

I'm asking because the documentation says that "the package allows to easily find the category behind each entity (e.g. "banana" is type "food" OR "Microsoft" is type "company")".

egerber commented 3 years ago

You need to call get_super_entities() on the linked entity:

import spacy  # version 3.0.6

# initialize language model
nlp = spacy.load("en_core_web_md")

# add pipeline (declared through entry_points in setup.py)
nlp.add_pipe("entityLinker", last=True)

doc = nlp("Microsoft was founded by Bill Gates")

# the first linked entity and the labels of its super entities (categories)
original_label = doc._.linkedEntities[0].label
category_labels = [entity.label for entity in doc._.linkedEntities[0].get_super_entities()]

print("original label: {}".format(original_label))
print("category labels: {}".format(category_labels))

In the given example, the output should be:

original label: Microsoft
category labels: ['software company', 'business', 'enterprise']
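
The same idea can be applied to every linked entity in a document rather than just the first one. The sketch below is only an illustration: the helper names category_labels and categories_by_entity and the StubEntity class are hypothetical and not part of the package. It relies only on the label attribute and get_super_entities() used in the answer above; the stub stands in for real linked entities so the snippet runs without a language model or knowledge base.

```python
from typing import Dict, Iterable, List


def category_labels(entity) -> List[str]:
    """Return the labels of an entity's super entities (its categories)."""
    return [parent.label for parent in entity.get_super_entities()]


def categories_by_entity(entities: Iterable) -> Dict[str, List[str]]:
    """Map each entity's label to the list of its category labels."""
    return {entity.label: category_labels(entity) for entity in entities}


class StubEntity:
    """Hypothetical stand-in for a linked entity (assumption, demo only)."""

    def __init__(self, label, parents=()):
        self.label = label
        self._parents = list(parents)

    def get_super_entities(self):
        return self._parents


# Demo with stub data mirroring part of the example output above.
company = StubEntity("software company")
microsoft = StubEntity("Microsoft", [company, StubEntity("business")])
print(categories_by_entity([microsoft]))
```

With the real pipeline, passing doc._.linkedEntities to categories_by_entity should work in the same way, assuming the collection can be iterated over directly. Note that entities sharing a label would overwrite each other in the resulting dictionary.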