allenai / scispacy

A full spaCy pipeline and models for scientific/biomedical documents.
https://allenai.github.io/scispacy/
Apache License 2.0
1.66k stars 223 forks source link

Discrepancy in the bc5cdr ner model output for different spacy versions and different OS #492

Closed Arjuman23 closed 9 months ago

Arjuman23 commented 11 months ago

I have identified a discrepancy in the entities detected by the "en_ner_bc5cdr_md-0.5.1" model between results obtained from a Windows system and an Ubuntu system. According to the readme file of the "en_ner_bc5cdr_md-0.5.1" model, it is trained up to Spacy version 3.5.0. Interestingly, this alignment holds true for the Windows system. Whenever I adjust the Spacy version to a value above 3.5.0, the named entity recognition (NER) results are no longer produced. The model en_ner_bc5cdr_md-0.5.0 worked irrespective of the spacy version.

However, an interesting scenario emerged when I conducted the same experiment on an Ubuntu system. Here, the "en_ner_bc5cdr_md-0.5.1" model generated NER outputs regardless of the Spacy version I employed. I even tested it with versions like 3.6.1 and even lower than that.

This leads me to the question: Why is this discrepancy in behavior occurring between the Windows and Ubuntu systems? Is this a known issue? Am I missing something??

dakinggg commented 9 months ago

I don't know why there would be a discrepancy between Ubuntu and Windows, but using versions of spacy that are not supported by a model is undefined behavior.