Open cbizon opened 3 years ago
This is true for many other types as well.
We completed a lot of identifier mappings prior to the adoption of standards by Translator. My understanding was that 'extra' prefixes weren't an issue?
Many of the ones in the list I posted are clearly wrong either in formatting or type.
I am unsure whether it's legal to provide extras but if there is a reason that these are legal prefixes then they should be added to the model so that others know they are valid. Otherwise nobody will call you using them.
From: karafecho @.> Sent: Monday, June 14, 2021 10:32:06 AM To: NCATS-Tangerine/icees-api @.> Cc: Bizon, Christopher A @.>; Author @.> Subject: Re: [NCATS-Tangerine/icees-api] Illegal prefixes in /meta_knowledge_graph (#125)
@patrickkwang is the key here valid? "biolink:Disease, DiseaseOrPhenotypicFeature"
?
@cbizon No, the key is bad.
Would the code in plater for creating /meta_knowledge_graph be helpful?
We have fixed some of the typos. We still need to check whether all identifiers are now accepted by TRAPI.
There are still a lot of things that I'm sure won't validate. You can use the Biolink Model Toolkit (or the bl-lookup service) to easily find out what the allowed prefixes for a given concept are.
It might also make sense to validate annotation curies. It seems as though there are annotations that lack a ":" and are ending up creating new prefixes like "PUBCHEM123124". That's my guess anyway.
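A minimal sketch of the kind of check suggested above, assuming the annotation identifiers are available as plain strings. The `ALLOWED_PREFIXES` set and the function names are hypothetical and only for illustration; the real allowed list for a concept should come from the Biolink Model (for example via the toolkit or bl-lookup).

```python
import re

# Hypothetical allowlist for illustration only; the real list should be
# looked up from the Biolink Model for the concept in question.
ALLOWED_PREFIXES = {"MONDO", "UMLS", "MESH", "NCIT", "HP"}

# A CURIE is "<prefix>:<local id>"; neither part may be empty.
CURIE_PATTERN = re.compile(r"^(?P<prefix>[^:\s]+):(?P<local_id>\S+)$")


def check_curie(identifier, allowed_prefixes=ALLOWED_PREFIXES):
    """Return None if the identifier looks valid, else a short reason."""
    match = CURIE_PATTERN.match(identifier)
    if match is None:
        return "not a CURIE (missing ':' separator or empty part)"
    prefix = match.group("prefix")
    if prefix not in allowed_prefixes:
        return f"prefix {prefix!r} is not in the allowed list"
    return None


def find_bad_identifiers(identifiers, allowed_prefixes=ALLOWED_PREFIXES):
    """Map each problematic identifier to the reason it was rejected."""
    problems = {}
    for identifier in identifiers:
        reason = check_curie(identifier, allowed_prefixes)
        if reason is not None:
            problems[identifier] = reason
    return problems
```

Running `find_bad_identifiers` over the annotation identifiers before building /meta_knowledge_graph would flag a value like "PUBCHEM123124" as a malformed CURIE instead of letting it become a new prefix.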
Also, just to be clear, problems with meta_knowledge_graph are currently preventing strider from getting information from ICEES.
Take the query Asthma -[correlated with]-> chemical.
Right now, strider returns no ICEES results, and it will not until this is fixed.
The problem is not only the prefixes, but their order, which is meaningful.
Here are the id_prefixes for Disease:
"biolink:Disease": {
    "id_prefixes": [
        "ICD10R",
        "ICD10",
        "UMLS",
        "UMLSCUIC0023895",
        "OMIM",
        "UMLSCUI",
        "SCTID",
        "MONDO",
        "SCITD",
        "MESH",
        "ICD9",
        "NCIT",
        "CHEBI",
        "HP",
        "CPT",
        "PUBCHEM",
        "UMLSCUOI",
        "LOINC",
        "IC10"
    ]
},
What this means is: ICEES wants you to send it ICD10R codes. If the concept doesn't have an ICD10R code, it wants you to send an ICD10 code, and if it doesn't have that, please send it a UMLS, etc etc.
According to nodenorm, here are the possible values for asthma:
https://nodenormalization-sri.renci.org/1.1/get_normalized_nodes?curie=MONDO%3A0004979
So, it doesn't have an ICD10R code (which is not a biolink allowed prefix anyway). But it does have an ICD10 code, ICD10:J45. Strider compares the nodenorm results to the meta_knowledge_graph results, and says "oh, ICEES wants this ICD10 code" and sends that query.
However, ICEES doesn't return any results for that ICD10 code. It does return results for the MONDO code. End result: no ICEES results for strider.
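The selection behavior described above can be sketched roughly as follows. This is not Strider's actual code; `pick_preferred_curie` is a hypothetical helper, the prefix list is abridged from the Disease list above, and the equivalent identifiers are an illustrative subset of what nodenorm returns for asthma.

```python
def pick_preferred_curie(equivalent_curies, id_prefixes):
    """Return the equivalent CURIE whose prefix comes earliest in
    id_prefixes, or None if none of the prefixes match."""
    by_prefix = {}
    for curie in equivalent_curies:
        prefix = curie.split(":", 1)[0]
        # Keep the first CURIE seen for each prefix.
        by_prefix.setdefault(prefix, curie)
    # Walk the advertised prefixes in order; the first hit wins.
    for prefix in id_prefixes:
        if prefix in by_prefix:
            return by_prefix[prefix]
    return None


# Abridged from the Disease id_prefixes above (order preserved).
icees_disease_prefixes = ["ICD10R", "ICD10", "UMLS", "MONDO"]
# Illustrative subset of the identifiers nodenorm reports for asthma.
asthma_equivalents = ["MONDO:0004979", "ICD10:J45"]

print(pick_preferred_curie(asthma_equivalents, icees_disease_prefixes))
# prints "ICD10:J45"
```

Because ICD10 is listed ahead of MONDO, the ICD10 code is chosen even though only the MONDO identifier returns results. Moving MONDO to the front of the list (and dropping prefixes ICEES cannot answer for) would make the same logic pick MONDO:0004979.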
@cbizon: Thanks for the detailed explanation. Very helpful! As it turns out, neither Hao nor I was aware that the order of the prefixes mattered. We will fix that. We're also still working to fix the CURIEs.
Just so you are aware, we are in the process of overhauling the way we handle our API config files, so while the issues delineated here and elsewhere are taking a while to resolve, things should proceed much more smoothly moving forward.
e.g.
Many of these are not allowed prefixes in the biolink model, including UMLSCUI, CUMLSCUIOI, etc.