caufieldjh closed this issue 7 months ago
The following PR resolves this issue by querying recognized entities for an exact match in Solr. "Persistent" and "probable" are not being annotated. However, "severe" was identified as a PhenotypicFeature, as shown below:
{
  "text": "Severe",
  "tokens": [
    {
      "id": "HP:0012828",
      "category": "biolink:PhenotypicFeature",
      "name": "Severe",
      "full_name": null,
      "deprecated": null,
      "description": "Having a high degree of severity. For quantitative traits, a deviation of between four and five standard deviations from the appropriate population mean.",
      "xref": [],
      "provided_by": "phenio_nodes",
      "in_taxon": null,
      "in_taxon_label": null,
      "symbol": null,
      "synonym": [
        "Severe"
      ],
      "uri": null
    }
  ],
  "start": 653,
  "end": 659
},
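To illustrate why an exact-match check alone still lets "Severe" through, here is a minimal sketch of such a filter applied to a record shaped like the one above. The function name `exact_matches` is hypothetical, and this is not the code from the PR: it only assumes that a token is kept when its `name` or one of its `synonym` values equals the span text, ignoring case.

```python
def exact_matches(annotation):
    """Keep tokens whose name or a synonym equals the span text (case-insensitive).

    Hypothetical helper for illustration; not the annotator's actual logic.
    """
    text = annotation["text"].strip().casefold()
    kept = []
    for token in annotation.get("tokens", []):
        # Collect every label the token could match on; fields may be null.
        labels = [token.get("name") or ""] + list(token.get("synonym") or [])
        if any(label.strip().casefold() == text for label in labels):
            kept.append(token)
    return kept


# Trimmed copy of the record above, keeping only the fields the filter reads.
severe = {
    "text": "Severe",
    "tokens": [{"id": "HP:0012828", "name": "Severe", "synonym": ["Severe"]}],
    "start": 653,
    "end": 659,
}

print([t["id"] for t in exact_matches(severe)])  # ['HP:0012828']
```

Because HP:0012828 is literally named "Severe", it passes an exact-match test, so removing it would need a different rule than the one that removes partial matches.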
When using the text annotator with the abstract of this case report: https://pubmed.ncbi.nlm.nih.gov/38130915/ some modifiers, like "persistent" or "probable", are incorrectly parsed as disease or gene entities.

The phrase "Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2)" looks like this in the output, as an additional example:

That's a tricky one because "severe" is still a modifier but is also in the name of the disease. Overall, I wouldn't expect "severe" alone to be an entity, and in an ideal world it would be linked to the disease (and in some cases there may even be a more appropriate entity that way).
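One possible way to keep a lone modifier from being reported when it is part of a longer name is to drop any annotation whose span lies strictly inside a longer annotated span. The sketch below assumes annotations are dicts with `text`, `start`, and `end` keys, as in the annotator output. The function name `drop_nested_spans` is hypothetical, and the longer span in the example is an assumed annotation, not one taken from the actual output.

```python
def drop_nested_spans(annotations):
    """Drop annotations fully contained in a strictly longer annotation.

    Hypothetical post-processing step; offsets are treated as [start, end).
    """
    kept = []
    for a in annotations:
        nested = any(
            b is not a
            and b["start"] <= a["start"]
            and a["end"] <= b["end"]
            and (b["end"] - b["start"]) > (a["end"] - a["start"])
            for b in annotations
        )
        if not nested:
            kept.append(a)
    return kept


example = [
    # Offsets for "Severe" come from the record in this thread.
    {"text": "Severe", "start": 653, "end": 659},
    # Assumed annotation for the full name, for illustration only.
    {"text": "Severe Acute Respiratory Syndrome Coronavirus 2", "start": 653, "end": 700},
]

print([a["text"] for a in drop_nested_spans(example)])
```

This only helps when the longer name is itself annotated. A modifier that precedes an unrecognized phrase would still need separate handling, such as linking it to the neighboring disease entity.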