IBM / zshot

Zero and Few shot named entity & relationships recognition
https://ibm.github.io/zshot
MIT License

[Bug] Mentions are overridden by Entities when both have the same first and last item #25

Closed by marmg 2 years ago

marmg commented 2 years ago

Summary

When both mentions and entities are passed to PipelineConfig and the two lists share the same first and last element, their hashes are identical, so the mentions are overridden by the entities.
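The collision described above can be illustrated with a minimal, self-contained sketch. It does not use zshot's real code: the `Entity` dataclass is a stand-in, and `edge_hash` is a hypothetical key function that only looks at the first and last items, which is an assumption about how such a hash could behave, not a copy of the library's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    # Stand-in for zshot.utils.data_models.Entity, not the real class.
    name: str
    description: str
    vocabulary: Optional[tuple] = None


def edge_hash(items):
    # Hypothetical key that only considers the first and last items.
    return hash((items[0], items[-1]))


mentions = [
    Entity("first entity", "First Entity"),
    Entity("second entity", "Second Entity"),
    Entity("third entity", "Third Entity"),
]
entities = [
    Entity("first entity", "First Entity"),
    Entity("other second entity", "Different Second Entity"),
    Entity("third entity", "Third Entity"),
]

# Store both lists under their key; the second write replaces the first.
cache = {}
cache[edge_hash(mentions)] = mentions
cache[edge_hash(entities)] = entities

collided = edge_hash(mentions) == edge_hash(entities)
stored_mentions = cache[edge_hash(mentions)]
print(collided)
print([e.name for e in stored_mentions])
```

Because the middle element never contributes to the key, looking up the mentions returns the entities list instead.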

To Reproduce

import spacy
from zshot import Zshot, PipelineConfig
from zshot.utils.data_models import Entity

nlp = spacy.blank("en")

nlp_config = PipelineConfig(
    mentions=[
        Entity(name="first entity", description="First Entity"),
        Entity(name="second entity", description="Second Entity"),
        Entity(name="third entity", description="Third Entity")
    ],
    entities=[
        Entity(name="first entity", description="First Entity"),
        Entity(name="other second entity", description="Different Second Entity"),
        Entity(name="third entity", description="Third Entity")    
    ]
)
nlp.add_pipe("zshot", config=nlp_config, last=True)

print(nlp.get_pipe('zshot').mentions)

This will print:

[Entity(name='first entity', description='First Entity', vocabulary=None),
 Entity(name='other second entity', description='Different Second Entity', vocabulary=None),
 Entity(name='third entity', description='Third Entity', vocabulary=None)]

Expected behavior

It should print:

[Entity(name='first entity', description='First Entity', vocabulary=None),
 Entity(name='second entity', description='Second Entity', vocabulary=None),
 Entity(name='third entity', description='Third Entity', vocabulary=None)]
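One way to get the expected behavior is to derive the key from every item rather than only the first and last. The sketch below is an assumption for illustration, not the fix that was applied in zshot: `content_key` is a hypothetical helper, and plain dictionaries stand in for `Entity` objects.

```python
import hashlib
import json


def content_key(items):
    # Hypothetical helper: digest every item's fields, not only the edges.
    payload = json.dumps(items, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


mentions = [
    {"name": "first entity", "description": "First Entity"},
    {"name": "second entity", "description": "Second Entity"},
    {"name": "third entity", "description": "Third Entity"},
]
entities = [
    {"name": "first entity", "description": "First Entity"},
    {"name": "other second entity", "description": "Different Second Entity"},
    {"name": "third entity", "description": "Third Entity"},
]

# The lists differ only in the middle item, yet the keys are distinct.
keys_differ = content_key(mentions) != content_key(entities)
print(keys_differ)
```

With a key like this, two lists that differ only in a middle element no longer map to the same entry, at the cost of serializing the whole list each time the key is computed.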