Add a linker ensemble so that different linkers and different entity descriptions can be combined to improve performance.
Proposed solution
This PR implements LinkerEnsemble, which takes as input the list of linkers to use, the strategy (one of: max, count) and the threshold (the minimum score an entity needs in order to be kept).
It groups the entities by name and creates combinations of them, runs each linker on each of those sets of entities, and finally merges the results according to the strategy.
Example:
import spacy

from zshot import PipelineConfig
from zshot.linker import LinkerSMXM, LinkerTARS
from zshot.linker.linker_ensemble import LinkerEnsemble
from zshot.utils.data_models import Entity
from zshot import displacy

nlp = spacy.blank("en")

# Each entity name appears twice, with two alternative descriptions
config = PipelineConfig(
    entities=[
        Entity(name="fruits", description="The sweet and fleshy product of a tree or other plant."),
        Entity(name="fruits", description="Names of fruits such as banana, oranges"),
        Entity(name="vitamin", description="A nutrient that the body needs in small amounts to function "
                                           "and stay healthy"),
        Entity(name="vitamin", description="Vitamins are substances that our bodies need to develop and "
                                           "function normally")
    ],
    # Combine the predictions of two linkers
    linker=LinkerEnsemble(
        linkers=[
            LinkerSMXM(),
            LinkerTARS(),
        ],
        threshold=0.25
    )
)
nlp.add_pipe("zshot", config=config, last=True)

# annotate a piece of text
doc = nlp('Apple or oranges have a lot of vitamin C.')

# Visualize the result
displacy.render(doc, style='ent')
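The grouping and merging steps described under "Proposed solution" can be illustrated with a small, library-independent sketch. This is not the zshot implementation: the helper names `description_combinations` and `aggregate` are hypothetical, entities and predictions are plain tuples, and several details are assumptions made for illustration only (combinations are taken as one description per entity name, `max` keeps the highest-scoring label per span, and `count` scores a label by the fraction of runs that predicted it).

```python
from collections import Counter, defaultdict
from itertools import product


def description_combinations(entities):
    """Group (name, description) pairs by name, then yield every way of
    picking one description per name (an assumed reading of "combinations")."""
    by_name = defaultdict(list)
    for name, description in entities:
        by_name[name].append(description)
    names = list(by_name)
    for choice in product(*(by_name[name] for name in names)):
        yield dict(zip(names, choice))


def aggregate(runs, strategy="max", threshold=0.5):
    """Merge the (start, end, label, score) predictions of several runs.

    max:   per span, keep the label with the highest single score.
    count: per span, keep the most frequent label; its score is the
           fraction of runs that predicted it (an assumption).
    Spans whose final score is below the threshold are dropped.
    """
    by_span = defaultdict(list)
    for run in runs:
        for start, end, label, score in run:
            by_span[(start, end)].append((label, score))

    merged = []
    for (start, end), votes in sorted(by_span.items()):
        if strategy == "max":
            label, score = max(votes, key=lambda vote: vote[1])
        elif strategy == "count":
            label, n_votes = Counter(lab for lab, _ in votes).most_common(1)[0]
            score = n_votes / len(runs)
        else:
            raise ValueError(f"unknown strategy: {strategy!r}")
        if score >= threshold:
            merged.append((start, end, label, score))
    return merged


entities = [
    ("fruits", "The sweet and fleshy product of a tree or other plant."),
    ("fruits", "Names of fruits such as banana, oranges"),
    ("vitamin", "A nutrient that the body needs in small amounts"),
    ("vitamin", "Vitamins are substances that our bodies need"),
]
# 2 descriptions for "fruits" x 2 for "vitamin" -> 4 combinations
combos = list(description_combinations(entities))

# Made-up predictions from two runs (one per linker or combination)
runs = [
    [(0, 5, "fruits", 0.9), (9, 16, "fruits", 0.4)],
    [(0, 5, "fruits", 0.6), (31, 40, "vitamin", 0.2)],
]
by_max = aggregate(runs, strategy="max", threshold=0.25)
by_count = aggregate(runs, strategy="count", threshold=0.75)
```

With these toy inputs, `by_max` keeps the two "fruits" spans and drops the "vitamin" span (its best score, 0.2, is below 0.25), while `by_count` keeps only the span that both runs agree on. The two strategies trade off differently: `max` trusts the single most confident linker, whereas `count` rewards agreement between runs.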