allenai / scispacy

A full spaCy pipeline and models for scientific/biomedical documents.
https://allenai.github.io/scispacy/
Apache License 2.0
1.66k stars 223 forks

spaCy recognizes similar words as different entities #501

Closed LeiGong0125Carrot closed 7 months ago

LeiGong0125Carrot commented 8 months ago

Hello everyone,

I used the following code to do entity recognition in the MIMIC discharge_summary dataset.

import spacy
from scispacy.linking import EntityLinker  # importing this registers the "scispacy_linker" pipe

nlp = spacy.load("en_core_sci_sm")
nlp.add_pipe("scispacy_linker", config={"resolve_abbreviations": True, "linker_name": "umls"})
linker = nlp.get_pipe("scispacy_linker")

similar_list = ["spinal", "spinals", "Some SPINALS", "one SPINAL", "bulbar", "bulbars", "BULBAR", "BULBARS"]
for sent in similar_list:
    doc = nlp(sent)
    # Takes the first detected entity; raises IndexError if none is found.
    entity = doc.ents[0]
    print("Name: ", entity)
    # Each kb_ents item is a (CUI, score) pair; look up the full UMLS record by CUI.
    for umls_ent in entity._.kb_ents:
        print(linker.kb.cui_to_entity[umls_ent[0]])
    print("-----" * 15)

dakinggg commented 7 months ago

Hi, I'm not exactly sure what the question is, but generally speaking, these are imperfect machine learning models, and will make mistakes.
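
One way to see how much the linker's output varies across surface forms is to group the input strings by their top-scoring candidate. The sketch below is only an illustration and is not part of scispacy: `top_candidate`, `group_by_top_concept`, and the `link` callable are hypothetical names, and `link` is assumed to return (concept ID, score) pairs shaped like the `entity._.kb_ents` values in the code above.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# (concept ID, score) pairs, assumed to mirror the shape of entity._.kb_ents.
Candidate = Tuple[str, float]


def top_candidate(candidates: Sequence[Candidate], min_score: float = 0.0) -> Optional[str]:
    """Return the best-scoring concept ID, or None if nothing reaches min_score."""
    best = max(candidates, key=lambda c: c[1], default=None)
    if best is None or best[1] < min_score:
        return None
    return best[0]


def group_by_top_concept(
    texts: Sequence[str],
    link: Callable[[str], Sequence[Candidate]],  # hypothetical wrapper around the linker
    min_score: float = 0.0,
) -> Dict[Optional[str], List[str]]:
    """Group input strings by the concept ID of their top-scoring candidate."""
    groups: Dict[Optional[str], List[str]] = defaultdict(list)
    for text in texts:
        groups[top_candidate(link(text), min_score)].append(text)
    return dict(groups)
```

With the pipeline from the original post, `link` could be a small function that runs `nlp(text)` and returns `doc.ents[0]._.kb_ents` (or an empty list when no entity is found). Variants that land in different groups are the ones the model treats inconsistently; strings grouped under `None` had no candidate at or above `min_score`.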