TranslatorSRI / NodeNormalization

Service that produces Translator compliant nodes given a curie
MIT License

Incomplete normalization of 'abdomen' #188

Closed amykglen closed 9 months ago

amykglen commented 1 year ago

I noticed that the NodeNormalizer doesn't group the UBERON identifier for the concept 'abdomen' with other 'abdomen' identifiers:

https://nodenormalization-sri.renci.org/1.3/get_normalized_nodes?curie=UBERON:0000916&conflate=true

"UBERON:0000916": {
  "id": {
    "identifier": "UBERON:0000916",
    "label": "abdomen"
  },
  "equivalent_identifiers": [
    {
      "identifier": "UBERON:0000916",
      "label": "abdomen"
    }
  ],

https://nodenormalization-sri.renci.org/1.3/get_normalized_nodes?curie=UMLS:C0000726&conflate=true

"UMLS:C0000726": {
  "id": {
    "identifier": "UMLS:C0000726",
    "label": "Abdomen"
  },
  "equivalent_identifiers": [
    {
      "identifier": "UMLS:C0000726",
      "label": "Abdomen"
    },
    {
      "identifier": "MESH:D000005",
      "label": "Abdomen"
    }
  ],

Interestingly, there's also a separate cluster containing another 'abdomen' identifier plus a couple of 'abdominal cavity' IDs: https://nodenormalization-sri.renci.org/1.3/get_normalized_nodes?curie=UMLS:C0230168&conflate=true

"UMLS:C0230168": {
  "id": {
    "identifier": "UMLS:C0230168",
    "label": "Abdominal Cavity"
  },
  "equivalent_identifiers": [
    {
      "identifier": "UMLS:C0230168",
      "label": "Abdominal Cavity"
    },
    {
      "identifier": "MESH:D034841",
      "label": "Abdominal Cavity"
    },
    {
      "identifier": "NCIT:C12664",
      "label": "Abdomen"
    }
  ],

should all of the 'abdomen' ids go in one cluster and all of the 'abdominal cavity' ids go in another?
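
For anyone who wants to reproduce the observation, here is a minimal sketch that groups the identifiers quoted above by label and reports any label that is spread across more than one cluster. The `CLIQUES` table is copied by hand from the three responses in this comment rather than fetched live, and the names `CLIQUES`, `cliques_by_label` and `split_labels` are hypothetical helpers made up for illustration. Matching on lowercased labels is only a rough heuristic for spotting candidates; it is not a claim about how NodeNorm itself builds its clusters.

```python
from collections import defaultdict

# Cluster contents copied from the three responses quoted in this comment,
# keyed by the preferred identifier of each cluster (assumed sample data).
CLIQUES = {
    "UBERON:0000916": [("UBERON:0000916", "abdomen")],
    "UMLS:C0000726": [
        ("UMLS:C0000726", "Abdomen"),
        ("MESH:D000005", "Abdomen"),
    ],
    "UMLS:C0230168": [
        ("UMLS:C0230168", "Abdominal Cavity"),
        ("MESH:D034841", "Abdominal Cavity"),
        ("NCIT:C12664", "Abdomen"),
    ],
}


def cliques_by_label(cliques):
    """Map each lowercased label to the set of cluster leaders it appears under."""
    index = defaultdict(set)
    for leader, members in cliques.items():
        for _identifier, label in members:
            index[label.lower()].add(leader)
    return dict(index)


def split_labels(cliques):
    """Return the labels whose identifiers are spread over more than one cluster."""
    return {
        label: sorted(leaders)
        for label, leaders in cliques_by_label(cliques).items()
        if len(leaders) > 1
    }


if __name__ == "__main__":
    for label, leaders in split_labels(CLIQUES).items():
        print(f"{label!r} appears in {len(leaders)} clusters: {', '.join(leaders)}")
```

With the data above, only the label 'abdomen' is flagged, since it shows up under all three cluster leaders, while 'abdominal cavity' stays within a single cluster.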

gaurav commented 9 months ago

Updated URLs:

The good news is that the first two cliques have been combined in the current release. Yay!

NodeNorm still considers UBERON:0000916 "abdomen" and UBERON:0003684 "abdominal cavity" to be separate concepts (UBERON models the abdominal cavity as part of some abdomen), and NCIT:C12664 "abdomen" is still in the abdominal cavity clique. However, the definition of NCIT:C12664 suggests to me that it refers to the cavity rather than the abdomen, and both NCIT and UBERON have this cross-reference. So I think modeling them as two distinct cliques is correct.

I'm going to close this issue for now, but please do reopen it if I've misunderstood something or if we should take this up with UBERON or NCIT!