Closed sstemann closed 8 months ago
To fix the ARAX infores URL, I have made a PR to the Biolink project area: https://github.com/biolink/biolink-model/pull/1403
In response to your questions, @sstemann:
Yes, I believe that Monarch Initiative and SRI Reference Knowledge Graph API are identical. My investigative work suggests that Monarch Initiative may have been an earlier version of the SRI Reference Knowledge Graph API, one that was incorrectly named, given that Monarch Initiative isn't even a knowledge source. This is a known issue that we are working to resolve.
There are multiple issues here.
a. DrugBank and Drug Repurposing Hub - wiki pages exist, but the URLs in the infores catalog will need to be updated
- id: infores:drugbank
  status: released
  name: DrugBank
  xref:
    - http://www.drugbank.ca/
  synonym:
    - Drugbank
  knowledge level: curated
  agent type: not_provided
  description: >-
    A comprehensive, free-to-access, online database containing information
    on drugs and drug targets. As both a bioinformatics and a cheminformatics resource,
    we combine detailed drug (i.e. chemical, pharmacological and pharmaceutical) data
    with comprehensive drug target (i.e. sequence, structure, and pathway) information
- id: infores:drug-repurposing-hub
  status: released
  name: Drug Repurposing Hub
  xref:
    - https://clue.io/repurposing
  knowledge level: curated
  agent type: not_provided
  description: >-
    Curated and annotated collection of FDA-approved drugs, clinical
    trial drugs, and pre-clinical tool compounds with a companion information resource
b. Orphanet Rare Disease Ontology - a wiki page will need to be created and the URL in the infores catalog will need to be updated
- id: infores:ordo
  status: released
  name: Orphanet Rare Disease Ontology
  xref:
    - https://bioportal.bioontology.org/ontologies/ORDO
  synonym:
    - ORDO
  knowledge level: curated
  agent type: not_provided
- id: infores:orphanet
  status: released
  name: Orphanet
  xref:
    - https://www.orpha.net
  knowledge level: curated
  agent type: not_provided
c. Expander Agent (ARAX) has a wiki page, but ARAX should not be surfacing as a primary knowledge source. Same with ARAGORN, Service Provider, and Unsecret. If ARAs decide to surface their reasoning agent as a primary knowledge source, then they should be pointing to their agent, not their team, as the source, and they should not be naming the source, e.g., "Unsecret Agent OpenAPI for NCATS Biomedical Data Translator Reasoners". This is also a known issue. All four of the above ARAs are aware of the issue.
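For anyone who wants to audit which entries still need a URL update (items a and b), here is a minimal sketch. It assumes that "updated" means the xref should point at a page under the Translator-All wiki, as the infores:monarchinitiative stanza elsewhere in this thread does. The entries are copied from the stanzas above and reduced to plain Python dicts; the function name is hypothetical, and this is not part of any infores tooling.

```python
# Wiki base URL as it appears in the infores:monarchinitiative stanza in this thread.
WIKI_PREFIX = "https://github.com/NCATSTranslator/Translator-All/wiki/"

# Entries copied from the stanzas above, reduced to the fields needed here.
catalog = [
    {"id": "infores:drugbank", "xref": ["http://www.drugbank.ca/"]},
    {"id": "infores:drug-repurposing-hub", "xref": ["https://clue.io/repurposing"]},
    {"id": "infores:ordo", "xref": ["https://bioportal.bioontology.org/ontologies/ORDO"]},
    {"id": "infores:orphanet", "xref": ["https://www.orpha.net"]},
]


def entries_without_wiki_xref(entries, prefix=WIKI_PREFIX):
    """Return the ids of entries that have no xref URL under the wiki prefix."""
    return [
        entry["id"]
        for entry in entries
        if not any(url.startswith(prefix) for url in entry.get("xref", []))
    ]


missing = entries_without_wiki_xref(catalog)
print(missing)
```

With the four stanzas as quoted, every entry is reported, which matches the "URLs will need to be updated" notes above.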
Just so you are aware, the infores / wiki effort is an ongoing one, with very few persons contributing. Moreover, issues have surfaced and/or changes have been made that require attention / action by other teams, which we cannot control.
For the initial public release, we focused on those primary knowledge sources that were being returned by the ARS in response to select MVP1 (GARD) and MVP2 queries. Moving forward, the plan is described in tab two C10 here.
Hope this helps ...
I might need some help from EvanDietzMorris to answer about the difference between infores:sri-reference-kg, infores:automat-renci-sri-reference-kg and infores:monarchinitiative and what it takes for one of those to show up in the UI.
I don't know if it's practical, but it might be nice to phase out the name SRI Reference Graph, which I think historically was a biolink model conversion of the Dipper generated Monarch Initiative graph (and therefore had to be differentiated), but as we've rebuilt our graph pipeline, the monarch graph and the SRI reference graph are the same thing.
The two things that I'm not sure about:
Is the primary_knowledge_source preserved from the monarch-kg KGX files as it passes through ORION? Or is it replaced with infores:monarchinitiative?
Is there (still?) a KP that's built using api.monarchinitiative.org? This endpoint is still up, but hasn't been updated for 2 years and will only stay up for a limited time. (The data files that backed it will still be hosted and available, of course.)
@kevinschaper @EvanDietzMorris : You may find this G-sheet helpful, as it contains recent (late October) ARS/UI results from select MVP1/MVP2 queries.
Note that I think that infores:automat-renci-sri-reference-kg has been deprecated, but infores:sri-reference-kg and infores:monarchinitiative remain active. There's also infores:sri-ontology.
If you all decide to phase out infores:sri-reference-kg in favor of infores:monarchinitiative, I'd suggest that you rename the latter to infores:monarch-kg or infores:monarch-initiative-kg.
Copying my response to the other issue about this:
This is due to the (admittedly complicated) fact that we have a version of the monarch sri-reference-kg that we ingest for robokop, which is a subset of the real sri-reference-kg (but made from an older version of the graph that did not include the actual primary knowledge sources, so it gets infores:sri-reference-kg assigned as the primary source). So it's a bit of a mess: we currently have two redundant infores ids in the catalog for this content (infores:monarchinitiative and infores:sri-reference-kg), but neither should be returned as a primary knowledge source. Each should always be an aggregator, assuming every edge in the content has its own primary knowledge source. We'll need to rebuild our version of this graph and pick which infores we want to use as the proper aggregator knowledge source for this content.
And adding here: the graph I get from Kevin should currently end up with the following EPC for an edge:
- primary knowledge source: whatever is on the edge
- aggregator knowledge source 1: infores:monarchinitiative
- aggregator knowledge source 2: infores:automat-sri-reference-kg
Sounds like we need to:
Thanks for the clarification, Evan.
Those of us who are working on the infores / wiki reconciliation and update effort (me, Andy, Sierra, Matt, Carrie) would appreciate it if you did not refer to infores:monarchinitiative as an aggregator knowledge source. The reason is that monarchinitiative is a group/org, not a KG. If possible, and should you choose to consider infores:monarchinitiative as the aggregator knowledge source, then perhaps change the infores id to infores:monarchinitiative-kg or infores:monarch-initiative-kg.
I don’t have any strong opinions here. Sounds like something like this would work? Both the monarch initiative team and renci need to rebuild graphs before this could be deployed though.
- infores:monarch-initiative-kg (as the first aggregator for everything coming from this kg)
- infores:automat-monarch-initiative-kg (as a second aggregator for the whole graph hosted on automat)
- infores:automat-renci-monarch-initiative-kg (as a second aggregator for the renci version)
Oh! I'm putting infores:monarchinitiative as an aggregator knowledge source on nearly everything in monarch-kg (not the subset that comes from phenio, but that feels like a bug I should fix).
I like the idea of infores:monarch-kg (+ infores:automat-monarch-kg & infores:automat-renci-monarch-kg), just omitting the -initiative part, because we haven't ever referred to the graph that way before.
fwiw, it seems like the KG should come with infores:monarch-kg populated for all edges, and then the automat & automat-renci pipelines should only add their own names to the aggregator list.
Also, are the aggregator values already present in the KG preserved? For example, we get OMIM & Orphanet via HPOA files, so we use infores:omim & infores:orphanet as primary, and add infores:hpo-annotations as an aggregator (along with infores:monarchinitiative - which I can switch to infores:monarch-kg).
> fwiw, it seems like the KG should come with infores:monarch-kg populated for all edges, and then the automat & automat-renci pipelines should only add their own names to the aggregator list.
This is how it works now, with the exception that if an edge is missing a primary knowledge source the automat infores will get assigned as one (which should really never be the case).
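A minimal sketch of that behavior, for illustration only. This is not plater or ORION code: the function name and the plain-dict edge are hypothetical, and the field names are simplified from the `primary_knowledge_source` and `biolink:aggregator_knowledge_source` properties mentioned in this thread. It also assumes that in the fallback case the serving infores becomes the primary source without also being listed as an aggregator, which the thread does not spell out.

```python
def add_aggregator(edge, aggregator_id):
    """Return a copy of the edge with the serving pipeline's infores id recorded.

    Normal case: the edge already has a primary knowledge source, so the id is
    appended to the aggregator list (existing aggregators are kept, no duplicates).
    Fallback case: the edge has no primary knowledge source, so the id is
    assigned as the primary source instead (assumed not to be added as an aggregator).
    """
    updated = dict(edge)
    aggregators = list(edge.get("aggregator_knowledge_source", []))
    if not edge.get("primary_knowledge_source"):
        updated["primary_knowledge_source"] = aggregator_id
    elif aggregator_id not in aggregators:
        aggregators.append(aggregator_id)
    updated["aggregator_knowledge_source"] = aggregators
    return updated


# Example based on the OMIM-via-HPOA case described above, with the proposed new ids.
edge = {
    "primary_knowledge_source": "infores:omim",
    "aggregator_knowledge_source": ["infores:hpo-annotations", "infores:monarch-kg"],
}
served = add_aggregator(edge, "infores:automat-monarch-kg")
print(served["aggregator_knowledge_source"])
```

The input edge is copied rather than mutated, so the KG's own provenance stays intact and only the served copy carries the extra aggregator.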
> Also, are the aggregator values already present in the KG preserved?
This is currently a limitation of plater - because we don't have any examples of multiple aggregators chained before an edge gets into one of our knowledge graphs, we don't have a way to represent that which plater understands at the moment. We were planning to implement it very soon though. Currently only one aggregator can be specified per edge as "biolink:aggregator_knowledge_source". I'm curious how you have multiple ones represented now. Let's discuss on Slack - if we need to rebuild the graphs anyway, we can go ahead and implement a solution that will support chaining.
To clarify: after talking with Kevin, I realized I misspoke. We can support multiple aggregators in the biolink:aggregator_knowledge_source field, but it assumes that they are parallel and not chained together. Either way, I think we know what needs to be done here:
After some discussion we have decided to remove the RENCI version of the graph from automat completely (though we will still be including edges from it within the robokop kg). This will eliminate complexity and confusion stemming from having two versions etc.
So now we just need to change the infores ids to infores:monarch-kg and infores:automat-monarch-kg and rebuild the monarch kg with edges with the new infores. I'll remove "biolink" our version of this from plater and any infores ids associated with that.
I would vote to leave the infores id as infores:monarchinitiative vs changing it to infores:monarch-kg. Changing this id would necessitate me trying to update several other infores sources currently defined at the "organization" level vs. at the "kg" level, as well as handling several deprecations of identifiers (and managing the dissemination of the deprecation through the rest of the KPs that might use those deprecated identifiers), and making sure we define "kg" at a granular enough level so that everyone is on the same page about what "-kg" means.
What if we add the "-kg" version to the synonym field? Would that be a good compromise?
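To make the synonym idea concrete, here is a rough sketch of how a consumer could resolve either form to the canonical id. Everything here is an assumption for illustration: the thread does not say whether the synonym would hold a full id such as infores:monarch-kg or just a label, the helper names are hypothetical, and nothing below reflects how the UI or any KP actually reads the catalog.

```python
# Hypothetical catalog entry with the "-kg" form recorded as a synonym.
catalog = [
    {
        "id": "infores:monarchinitiative",
        "name": "Monarch Initiative",
        "synonym": ["infores:monarch-kg"],  # assumed encoding of the compromise
    },
]


def build_lookup(entries):
    """Map every canonical id and every synonym to its canonical id."""
    lookup = {}
    for entry in entries:
        lookup[entry["id"]] = entry["id"]
        for synonym in entry.get("synonym", []):
            # setdefault keeps a canonical id from being shadowed by a synonym.
            lookup.setdefault(synonym, entry["id"])
    return lookup


def resolve(identifier, lookup):
    """Return the canonical id for an identifier, or the identifier itself if unknown."""
    return lookup.get(identifier, identifier)


lookup = build_lookup(catalog)
print(resolve("infores:monarch-kg", lookup))
```

The trade-off is that the catalog keeps one released id and avoids a deprecation cycle, but every consumer that might see the "-kg" form has to do this lookup.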
This is the current infores:monarchinitiative stanza. Does the wiki URL need to change?
- id: infores:monarchinitiative
  status: released
  name: Monarch Initiative
  xref:
    - https://github.com/NCATSTranslator/Translator-All/wiki/SRI-Reference-Knowledge-Graph
  knowledge level: curated
  agent type: not_provided
@sierra-moxon : Based on your post here and various Slack exchanges, I think you are fine with Evan moving forward with deprecation of the duplicate RENCI Monarch graph, but you have concerns about changing the infores id from infores:monarchinitiative to infores:monarch-kg, in part because you are not comfortable with the tag -kg. Is that right? If so, then it sounds like Evan can move forward with deprecation of the duplicate RENCI Monarch graph but maybe leave the infores id as infores:monarchinitiative for the time being, until a broader discussion takes place.
Regardless, the URL will need to be changed/retitled, but I can create a PR for that as part of the infores / wiki effort.
Thanks @karafecho - that would be great, and it does sum up my comment nicely. :)
We have already removed our version of the graph from automat, but edges with “sri-reference-kg” will still be coming from robokop until we build robokop again. For the future those edges will just have whatever comes directly from the monarch kg.
FYI: I created a PR to change the URLs for DrugBank, Drug Repurposing Hub, Orphanet Rare Disease Ontology (I created a new wiki page), and Monarch Initiative (retitled from SRI Reference KG), and also to deprecate infores:automat-renci-sri-kg.
Update: Sierra merged the PR.
Per discussion on the TAQA call on 11/17:
I've assigned myself to this ticket to keep track of this deployment, and will update once it's complete.
@EvanDietzMorris : A few of us, including Sarah and Sierra, met to discuss this ticket as part of TAQA. We decided to keep infores:monarchinitiative and NOT change it to infores:monarchinitiative-kg.
As such, I am closing this ticket.
Questions about source links (this query has some examples: https://ui.test.transltr.io/main/results?l=Aicardi%20Syndrome&i=MONDO:0010568&t=0&q=d1d73626-7167-40e4-b38b-5f9cfef698eb)
Are Monarch Initiative and SRI Reference Knowledge Graph API the same source?
![image](https://github.com/NCATSTranslator/Feedback/assets/38321826/ebc80562-5754-4944-9db9-2681f70d6501)
Why are there no wiki pages for: ARAX, DrugBank, Orphanet Rare Disease Ontology, Drug Repurposing Hub?
![image](https://github.com/NCATSTranslator/Feedback/assets/38321826/3d693d21-3da3-4d9b-bcc4-a810db8be5e4)
https://ui.test.transltr.io/main/results?l=Psoriasis&i=MONDO:0005083&t=0&q=c58a990c-1536-4c98-b454-20f260625407
![image](https://github.com/NCATSTranslator/Feedback/assets/38321826/9574eb9b-69ce-4f22-b17a-82b7a019ee2c)