Closed MarkDWilliams closed 1 year ago
@cbizon @bill-baumgartner @webyrd i wasn't able to tag everyone in the Assignees, so just making sure you see this ticket for tomorrow (2/25) standup.
FWIW, I've been running a few simple queries to get some landscape:
https://arax.ncats.io/?r=b3ffebd5-95a6-4cc3-8007-8dc6bf6d2a17
{
"message": {
"query_graph": {
"edges": {
"e00": {
"subject": "n01",
"object": "n00"
},
"e01": {
"subject": "n01",
"object": "n02"
}
},
"nodes": {
"n00": {
"ids": [
"PUBCHEM.COMPOUND:3121"
]
},
"n01": {
"categories": [
"biolink:NamedThing"
]
},
"n02": {
"ids": [
"NCBIGene:211"
]
}
}
}
}
}
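A small sketch for reading query graphs like the one above at a glance. It prints one line per edge; the helper name `describe_query_graph` is an illustration only and is not part of ARAX or TRAPI tooling.

```python
def describe_query_graph(query):
    """Return one human-readable line per edge of a TRAPI query graph."""
    qg = query["message"]["query_graph"]

    def label(node_key):
        node = qg["nodes"][node_key]
        # Prefer pinned identifiers, then categories, then the bare node key.
        parts = node.get("ids") or node.get("categories") or [node_key]
        return f"{node_key}({', '.join(parts)})"

    lines = []
    for edge_key, edge in sorted(qg["edges"].items()):
        predicates = edge.get("predicates") or ["any predicate"]
        lines.append(
            f"{edge_key}: {label(edge['subject'])} "
            f"-[{', '.join(predicates)}]-> {label(edge['object'])}"
        )
    return lines


# The query graph posted above: an unpinned middle node linked to
# valproic acid (PUBCHEM.COMPOUND:3121) and ALAS1 (NCBIGene:211).
query = {
    "message": {
        "query_graph": {
            "edges": {
                "e00": {"subject": "n01", "object": "n00"},
                "e01": {"subject": "n01", "object": "n02"},
            },
            "nodes": {
                "n00": {"ids": ["PUBCHEM.COMPOUND:3121"]},
                "n01": {"categories": ["biolink:NamedThing"]},
                "n02": {"ids": ["NCBIGene:211"]},
            },
        }
    }
}

for line in describe_query_graph(query):
    print(line)
```

The same helper works on any of the query graphs in this thread, since it only relies on the `nodes` and `edges` maps.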
https://arax.ncats.io/?r=3568272b-ac28-42f8-b47d-3d64a9db3da8
Here are a couple of results that look at "representative" intermediate nodes: ones that are more related/connected to ALAS1 and Valproic acid than other ones based on the Fisher exact test. Stringent filtering was applied due to the size of the intermediate results (which would otherwise yield tens of thousands of paths), so any hint about what kinds of connections the SME is looking for would be helpful. E.g., what sort of intermediate node types are of interest or not of interest?
https://arax.ncats.io/?r=37356 https://arax.ncats.io/?r=37355
Those are written in ARAXi, but @finnagin can translate them to the Operations and Workflow language if there is interest (though under the hood it would be back-translated to ARAXi).
Note that the links I just posted are basically @cbizon's TRAPI here, with ARAXi tacked on.
I am tagging @suihuang-ISB as well. I know he wanted to follow this issue.
I also ran a few simple queries.
Query structure 1: ChemicalEntity (valproic acid) – [related_to] – Gene – [related_to] – DiseaseOrPhenotypicFeature (liver toxicity)
`{
"message": {
"query_graph": {
"edges": {
"e01": {
"object": "n0",
"subject": "n1",
"predicates": [
"biolink:related_to"
]
},
"e02": {
"object": "n1",
"subject": "n2",
"predicates": [
"biolink:related_to"
]
}
},
"nodes": {
"n0": {
"ids": [
"PUBCHEM.COMPOUND:3121"
],
"categories": [
"biolink:ChemicalEntity"
]
},
"n1": {
"categories": [
"biolink:Gene"
]
},
"n2": {
"ids": [
"MONDO:0005359"
],
"categories": [
"biolink:DiseaseOrPhenotypicFeature"
]
}
}
}
}
}
`
PK = 03933a7c-c96e-4edf-8693-480072d1e5e0
Summary and next steps: Results from ARAGORN, ARAX, BTE, and Text Miner. ALAS1 not among results. Five causal DILI genes (ERAP2, EXOC4, PTPN22, HLA-I, HLA-II) also not among results. Next steps include honing in on the specific liver pathology of interest to the SME and exploring the connection between ALAS1, which is expressed in the liver (and elsewhere) and encodes a mitochondrial enzyme involved in heme biosynthesis, and valproic acid, which is an anticonvulsant that carries a black box warning related to liver toxicity.
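Checking whether a gene of interest is "among results" can be scripted. The sketch below assumes the standard TRAPI layout in which each result carries `node_bindings` keyed by query node; the toy response and its `EXAMPLE:` identifiers are made up for illustration. Exact string matching can miss a gene that a tool returns under an equivalent identifier, so treat a negative answer with care.

```python
def bound_ids(response, qnode_key):
    """Collect the identifiers bound to one query node across all results."""
    found = set()
    for result in response.get("message", {}).get("results") or []:
        for binding in result.get("node_bindings", {}).get(qnode_key, []):
            found.add(binding["id"])
    return found


# Toy response, fabricated for illustration only (placeholder identifiers).
toy_response = {
    "message": {
        "results": [
            {"node_bindings": {"n1": [{"id": "EXAMPLE:gene1"}]}},
            {"node_bindings": {"n1": [{"id": "EXAMPLE:gene2"}]}},
        ]
    }
}

genes = bound_ids(toy_response, "n1")
# ALAS1 is NCBIGene:211 in the queries in this thread.
print("NCBIGene:211" in genes)
```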
Query structure 2: ChemicalEntity (valproic acid) – [related_to] – Gene – [related_to] – DiseaseOrPhenotypicFeature (liver injury)
`{
"message": {
"query_graph": {
"edges": {
"e01": {
"object": "n0",
"subject": "n1",
"predicates": [
"biolink:related_to"
]
},
"e02": {
"object": "n1",
"subject": "n2",
"predicates": [
"biolink:related_to"
]
}
},
"nodes": {
"n0": {
"ids": [
"PUBCHEM.COMPOUND:3121"
],
"categories": [
"biolink:ChemicalEntity"
]
},
"n1": {
"categories": [
"biolink:Gene"
]
},
"n2": {
"ids": [
"NCIT:C26946"
],
"categories": [
"biolink:DiseaseOrPhenotypicFeature"
]
}
}
}
}
}
`
PK = 4108269a-2ca1-4d4f-9c63-efdf65c7c559
Summary and next steps: Results from ARAGORN, ARAX, BTE, and Explanatory. ALAS1 not among results. Five causal DILI genes (ERAP2, EXOC4, PTPN22, HLA-I, HLA-II) also not among results. Next steps include honing in on the specific liver pathology of interest to the SME and exploring the connection between ALAS1, which is expressed in the liver (and elsewhere) and encodes a mitochondrial enzyme involved in heme biosynthesis, and valproic acid, which is an anticonvulsant that carries a black box warning related to liver toxicity.
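Query structures 1 and 2 differ only in the identifier pinned on the disease node, so they can be generated from one template. This is a minimal sketch; the function name `chem_gene_disease_query` is hypothetical.

```python
def chem_gene_disease_query(chemical_id, disease_id, predicate="biolink:related_to"):
    """Build the ChemicalEntity - Gene - DiseaseOrPhenotypicFeature query graph."""
    return {
        "message": {
            "query_graph": {
                "edges": {
                    "e01": {"object": "n0", "subject": "n1", "predicates": [predicate]},
                    "e02": {"object": "n1", "subject": "n2", "predicates": [predicate]},
                },
                "nodes": {
                    "n0": {
                        "ids": [chemical_id],
                        "categories": ["biolink:ChemicalEntity"],
                    },
                    "n1": {"categories": ["biolink:Gene"]},
                    "n2": {
                        "ids": [disease_id],
                        "categories": ["biolink:DiseaseOrPhenotypicFeature"],
                    },
                },
            }
        }
    }


# Identifiers taken from the two queries above.
structure_1 = chem_gene_disease_query("PUBCHEM.COMPOUND:3121", "MONDO:0005359")
structure_2 = chem_gene_disease_query("PUBCHEM.COMPOUND:3121", "NCIT:C26946")
```

Keeping the template in one place makes it easier to swap in a more specific liver pathology once the SME names one.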
This is my process of incorporating the mitochondrial functions + ALAS1 + valproic acid. I focused on using the BTE tool...
Expand the steps below to see the queries/links/reasoning
Note:
DiseaseOrPhenotypicFeature
perhaps) or the "liver model" info yet. It would help to know of a specific pathology the SME is interested in, or what is being studied with the "liver model"...BiologicalProcessOrActivity
nodes I'm interested in taking for another hop (after an info page...I've had to look up terms)
Both hepatic porphyria and acute liver failure link ALAS1 and VPA
From Eugene H.: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3696515/
Firstly, Prateek and I talked and we think I should be included in the assignees for these challenge questions, if possible.
I interpreted the challenge question in the same way that @colleenXu did. Namely, that the question revolved around shared processes/pathways/mechanisms. However, I'd like to add some additional commentary.
The challenge question seemed to include direct observations from the SME:
For example, I saw ALAS1 gene is clearly induced in a highly potent manner by valproic acid, and when taking the chemical away, the signal (along with many other genes) goes away, while others remain.
I'm assuming that these observations were a result of some kind of differential expression study, wherein details of the gene regulatory network are not known and we have only correlational evidence of valproic acid's effect on ALAS1, and not mechanistic evidence. However, it seems as though the SME was searching for a direct mechanistic connection from a Google search:
A quick Google search didn’t show a lot of connection between ALAS1 and valproic acid. However, mitochondrial function was linked to ALAS1 and valproic separately.
Assuming the challenge question is attempting to identify potential mechanisms by which valproic acid (VA) induces hepatotoxicity (which I did not see explicitly stated), I think we can approach this query from two hypotheses:
Please correct me if you see that I've made any incorrect associations here or if my logic is off for these hypotheses (might be worth bringing them up with the SME). But the basic requirements to start exploring these would use the following "1-hops" although ideally you'd want this to be a single, multi-hop query:
@ehinderer i will include you going forward, we'll be working on this one for a month
Thanks, I'll probably still need to confer with Prateek on technical issues, but I think I'd be more comfortable discussing the questions themselves.
Quick update from meeting with Steve F. on 02.28.2022:
A quick Medline search brings up this paper from 1988, which has the explicit answer in the introduction (a plausible biochemical/pharmacological mechanism). These are well-known mechanisms; many steps are in standard biochem textbooks.
While that paper shows a link between valproic acid, ALAS-1, heme (iron protoporphyrin) biosynthesis, and porphyria, I don't think it directly addresses the heart of the question, which is mechanisms (metabolic alterations?) through which valproic acid somehow interacts with ALAS1 to cause hepatotoxicity. It also doesn't address the observation that an effect of valproic acid on ALAS1 is found only in 3D hepatic microtissue models, not 2D models.
Regardless, I think it will be interesting to see if Translator can identify the same linkages and mechanisms reported in the paper that Sui found by way of a Medline search, as well as the ones that Steve is proposing.
We have been working to determine an "answer" to the QOTM from which we can then work to understand where and how to uncover the data in Translator. We propose the following diagram that @suihuang-ISB put together as an answer:
We've been working backwards, mostly in SPOKE, to see if we can recreate it and determine the TRAPI queries that might pull it back from Translator. As of today, we still haven't pulled out the full network as we seem to be missing some connections. We are still working through the many CYP genes and considering additional enzymes and identifiers. We plan to continue evaluating why we might be missing some of these links and where we might find them in a database. The edge between P450 and heme is a particularly tricky one. Note also the multi-step feedback from ALA to heme.
Hopefully this is useful as a discussion point and a jumping off point for other teams.
Generally speaking, for "pathway" questions where the number of hops is unknown, a missing feature of Translator / TRAPI is the ability to specify an unknown length (https://github.com/NCATSTranslator/ReasonerAPI/issues/154), though I understand how tricky that could be to implement via one-hopping. It is worth saying that this has come up for vote multiple times in TRAPI prioritization discussions, but it has never been voted highly enough to be considered so it may not be a highly desired feature by the network overall.
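Until an unknown-length path can be expressed directly, one workaround is to generate a separate fixed-length query for each hop count and submit them one by one. The sketch below only builds the query graphs; it is a workaround under stated assumptions, not a TRAPI feature, and the helper name `linear_query` is hypothetical. Edge direction is fixed from start to end here for simplicity, which a real query may need to relax.

```python
def linear_query(start_id, end_id, hops, middle_category="biolink:NamedThing"):
    """Build a fixed-length path query with `hops` edges between two pinned nodes."""
    if hops < 1:
        raise ValueError("hops must be at least 1")
    nodes = {"n0": {"ids": [start_id]}}
    edges = {}
    for i in range(1, hops + 1):
        key = f"n{i}"
        # The last node is pinned to the end identifier; the rest are open.
        if i == hops:
            nodes[key] = {"ids": [end_id]}
        else:
            nodes[key] = {"categories": [middle_category]}
        edges[f"e{i - 1}"] = {"subject": f"n{i - 1}", "object": key}
    return {"message": {"query_graph": {"nodes": nodes, "edges": edges}}}


# One query per path length, from a direct edge up to three hops.
queries = [
    linear_query("PUBCHEM.COMPOUND:3121", "NCBIGene:211", hops)
    for hops in range(1, 4)
]
```

The number of intermediate results grows quickly with each extra hop, so in practice the open nodes would likely need narrower categories than `biolink:NamedThing`.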
@karafecho : Thanks for the additional details:
"I don't think it directly addresses the heart of the question, which is mechanisms (metabolic alterations?) through which valproic acid somehow interacts with ALAS1 to cause hepatotoxicity."
==> A "mechanism" consists of (i) general biomedical principles and (ii) specific players that execute these principles. That 1988 paper provides the specific players for a well-known general principle of drug-induced CYP and metabolic changes. It is also well known that ALA (which accumulates if ALAS1 is upregulated) causes hepatotoxicity. Thus here the Translator would help to connect the dots (find the specific players). Not sure if we should also expect it to remind users of the general principles that the expert user should know or that are in medical textbooks (CYP induction by many drugs, heme utilization, liver toxicity of ALA accumulation...).
" It also doesn't address the observation that an effect of valproic acid on ALAS1 is found only in 3D hepatic microtissue models, not 2D models."
==> This is also a general principle that the expert will know without a Translator. A great majority of in vivo biology cannot be replicated in the artificial 2D cultures because cells are attached to plastic, which often dedifferentiates the cells, imitating stem-like state. In such 2D mono-layers on plastic, hepatocytes enter a wound-healing regenerative state that is not the same as the in vivo normal state which is much better recapitulated in 3D organoids where cells are not forced to attach to plastic. Thus, this is a generic principle.
In brief, I don't think these general biological principles will ever be captured by the Translator, but it also does not need to capture them since they are all in textbooks. The Translator, in my view, is really there to provide the specific players (often newly discovered) and to connect the dots (here: which CYP does valproic acid upregulate?), and maybe also to remind the user that heme normally represses ALAS and that ALA is hepatotoxic.
relating to https://github.com/NCATSTranslator/testing/issues/176#issuecomment-1058679090, BTE can find connections between Valproic acid -> CYP genes -> heme -> ALAS1. There are other genes as well, for a total of 116 results. I've included BTE's TRAPI response here (can copy and paste into ARAX for viewing) VPA_gene_heme_ALAS.txt
However, ARS doesn't seem to be correctly querying BTE and other ARAs...EDIT: added response from ARS: https://arax.ncats.io/?r=6bb5e3aa-883c-4d7d-970c-e112d208afdc
I'm also wondering about the roles of these genes in oxidative stress, since valproic acid has been linked to hepatotoxicity potentially through oxidative stress: see this article that implicates CYP2E1 and this article on mitochondrial ROS! and these two. Asking that the genes also be connected to oxidative stress...narrows the list down to 58 results.
I've included BTE's TRAPI response here (can copy and paste into ARAX for viewing) VPA_gene_heme_ALAS1_oxidativeStres.txt
EDIT: added response from ARS: https://arax.ncats.io/?r=2cd4bc49-f951-48f7-a1a5-2139b15c5c3e
One can also ask those genes to have pathway info, which narrows the result list down to 62. However, it's hard to browse since there are a lot of pathways. The SRI ID resolver also doesn't find labels for KEGG pathways, WIKIPATHWAYS, and biocarta.
I've included BTE's TRAPI response here (can copy and paste into ARAX for viewing) VPA_gene_heme_ALAS1_pathway.txt
EDIT: added response from ARS: https://arax.ncats.io/?r=18217b7f-dfd9-42e9-921f-f6a5a405a4c4
All the context and discussion above is great. But I think I'm still trying to understand the precise use case being explored here. I can think of these three possibilities:
1) SME believes the mechanism underlying VA's liver tox looks like VA -> ALAS1 -> GABA inhibition -> hepatic function -> liver tox, and Translator should find evidence supporting that hypothesis
2) SME believes that ALAS1 mediates the effect of VA on liver tox, and Translator should propose mechanistic details (and find roughly the mechanistic path above)
3) SME has identified many possible candidate genes (based on gene expression) that may mediate VA's effect on liver tox, and Translator should rank them (and find ALAS1 among the top ranked options)
I think any of these are plausible (and likely there are more plausible questions), but clearly the queries used to address them are quite different.
Agree with Andrew that the discussion here is great!
I believe Steve Ferguson (SME) will be attending our 12 pm ET call, so let's discuss the scenarios that Andrew put forward when we meet.
from today's meeting, I'll bring up some technical issues:
Great dissection into the three possibilities @andrewsu. My slightly departing view: (1) and (2) are right on; they form, I would dare say, a well-defined disjoint but jointly exhaustive set of use cases. Or at least >90% of cases, which makes it a good list of possibilities. (2) is exactly what entices many of us (biomedical researchers at ISB and collaborators) to use the Translator, which is also what Dr. Ferguson said: "To connect the dots". (This is one key application and corresponds to Workflow D.) By contrast, (3) is not a distinct scenario but rather a sequel to (2), for when many possible paths to "connect the dots" are returned. Of course ranking would be desirable in some way, but as you know we all struggle with that, and I don't think ranking with respect to likelihood of being true/relevant is within reach at the moment. Sorry, I am just cautious on this front (not that we should not try our best to rank, though).
And notes on the querying. I get the sense from the SME that multiple steps would have been needed to get to the query structure I was showing today. This is close to the process of iterative querying I was doing too...
Thanks for everyone's input on the call today. I feel like that was a really valuable exercise, and hope others do as well. Given that Stephen might be able to join us for the longer mini-hackathon, I'm leaning towards not extending the next Friday call, but I do want to lay out an agenda of results to review and people familiar with those results to go over them. I think if we can keep somewhat tightly to time bounds on those, we'll be able to get through a larger diversity of approaches. So, as we take the feedback from today in mind and iterate on our queries over the next week, please let me know if you have results you'd like to share (or previously run results that we didn't have time for today) and I'll lay out a schedule for the next call to make sure we get to them all. Thanks!
Apologies in advance if these questions have obvious answers, as I'm trying to follow a discussion that I'm poorly educated on...
Should we be taking into consideration the fact that liver injury from VA is rare as we try to uncover the mechanism? From my limited understanding, the above queries would more likely uncover the mechanism if liver injury was common among patients taking VA. Should we be taking into consideration the risk factors for liver injury?
The risk of developing liver damage is greater in children who are younger than 2 years of age and are also taking more than one medication to prevent seizures, have certain inherited diseases that may prevent the body from changing food to energy normally, or any condition that affects the ability to think, learn, and understand. source
I believe the SME is interested in ALAS1 since it had one of the strongest signals in liver models, but prior to running these queries and finding these links, was there other reason to believe that ALAS1 is involved in the mechanism for liver injury among the patients who are experiencing it, or more specifically, is part of what differentiates patients who experience liver injury from those who don't? Or is that a separate question?
Thanks to everyone who joined the call with Steve. I thought the meeting went very well and provided an opportunity to really hone in on Steve's scientific mental model and question.
In addition, I thought that Steve provided some valuable general feedback to Translator: (1) space and time are important considerations; (2) dose and concentration are equally important considerations; (3) 'trusted' data sources such as WikiPathways and BioPlanet are of interest to folks in Steve's domain (toxicology). While we are all aware of these issues, and I am a huge proponent of (1), the fact that Steve emphasized them perhaps suggests a need to prioritize.
One other thing that I found valuable was Steve's feedback that Translator would have allowed him to more rapidly refine his hypothesis and focus his testing plan; in other words, improve efficiency (time, cost) and accelerate discovery...just as intended!
I also think the call with Steve and the discussion were very useful. But they reminded me of one fundamental question we should keep in mind about what the Translator is and is not:
(A) It is a TOOL that HELPS the SME generate/confirm/expand/... their “mental model” (to use Kara’s term)
(B) It seeks to GENERATE said mental model on its own for the SME. Here, the Translator performs the actual scientific reasoning.
Sure the boundaries are fuzzy, but I am afraid we are veering into the domain of (B). I think the Translator should focus on (A) and do it well: Just connect the dots (=nodes) based on existing edges, extracted from “trusted” knowledge sources (not data sources), not more. At times, it seemed to me the discussion went beyond that and was all about (B): many of us raised (scientifically interesting) issues that pertain to scientific reasoning that I would not expect the Translator to perform (e.g. on kinetics of responses to drugs, ir/reversibility of responses, idiosyncratic responses, mitochondrial functions, etc ...). Perhaps in the future. But currently, computational scientific discovery, first proposed by Herbert Simon, the father of AI, in the 1980s, is still in its infancy, notably in biomedicine, despite some notable well-articulated dreams. I guess this is not what the creator of the Translator had envisioned.
Here was my attempt at answering the question. It is a 2-parter and could probably benefit from continued analysis and running related queries. I started with a query to collect biological processes or activities related to ALAS1.
{
"message": {
"query_graph": {
"edges": {
"e00": {
"subject": "n00",
"object": "n01",
"predicates":[
"biolink:related_to"
]
}
},
"nodes": {
"n00": {
"ids": [
"NCBIGene:211"
],
"categories":[
"biolink:Gene",
"biolink:Protein"
]
},
"n01": {
"categories":[
"biolink:BiologicalProcessOrActivity"
]
}
}
}
}
}
This was primarily used to extract the following pathways: "REACT:R-HSA-189451", "GO:0003870", "GO:0030170", "GO:0006782", "GO:0048821", "GO:0042541", "REACT:R-HSA-1592230", "REACT:R-HSA-189445", "GO:0042802", "REACT:R-HSA-400206", "GO:0016746", "GO:0005515", "GO:0033014", "UMLS:C0162565", "GO:0042168", "GO:0016747", "GO:0006779", "GO:0001666", "REACT:R-HSA-177160", "REACT:R-HSA-556833", "GO:0046501", "GO:0016749", "GO:0033013", "GO:0016748", "GO:0042440", "GO:0046148", "REACT:R-HSA-1592238", "PathWhiz:PW000176", "SMPDB:SMP0120764", "PathWhiz:PW121971", "PathWhiz:PW064604", "SMPDB:SMP0120770", "PathWhiz:PW121950", "PathWhiz:PW000698", "SMPDB:SMP0000484", "SMPDB:SMP0000223", "UMLS:C0597272", "MESH:C537236", "REACT:R-HSA-1989781", "REACT:R-HSA-1852241", "GO:0019842", "GO:0004109", "NCIT:C96833", "GO:0030097", "GO:0002520", "GO:0030099", "GO:0044255", "GO:0008289", "GO:0016246", "REACT:R-HSA-1592245", "UMLS:C2984402", "WIKIPATHWAYS:WP561", "GO:0002262", "UMLS:C2611249", "UMLS:C2610238", "UMLS:C1512401", "GO:0020027", "PathWhiz.Reaction:146672", "WIKIPATHWAYS:WP2875", "PathWhiz.Reaction:147775", "PathWhiz.Reaction:147829", "GO:0030703", "GO:0061515", "KEGG.PATHWAY:hsa00260", "GO:0050662", "PathWhiz.Reaction:1778", "PathWhiz.Reaction:146617", "GO:0003824", "GO:0048037", "GO:0051186", "BIOCARTA:ahsppathway", "UMLS:C1514238", "UMLS:C1514238", "GO:0034101", "GO:0048872", "PathWhiz.Reaction:148449", "PathWhiz.Reaction:149232", "GO:0016740", "REACT:R-HSA-2151201", "GO:0007304", "WIKIPATHWAYS:WP2882", "UMLS:C1512222", "KEGG.PATHWAY:hsa00860", "PathWhiz.Reaction:148504", "PathWhiz.Reaction:110250", "PathWhiz.Reaction:6637", "GO:0051188", "UMLS:C3549233", "PathWhiz.Reaction:1871", "UMLS:C2610972", "UMLS:C0700704", "UMLS:C0597295", "UMLS:C0002345", "MESH:D015533", "UMLS:C4476796".
To what degree these pathways are related to the question is unknown to me, but they do contain mitochondrial related pathways and are all ALAS1 related. I use these pathways to seed my next question:
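The identifier list above contains at least one repeated entry ("UMLS:C1514238"). A small sketch for de-duplicating a list of IDs before pinning them on a query node follows; the helper names `dedupe` and `seeded_node` are hypothetical, and only a short excerpt of the list is used.

```python
def dedupe(ids):
    """Drop repeated identifiers while keeping first-seen order."""
    return list(dict.fromkeys(ids))


def seeded_node(ids, category="biolink:BiologicalProcessOrActivity"):
    """Build a TRAPI query node pinned to the de-duplicated identifiers."""
    return {"categories": [category], "ids": dedupe(ids)}


# A short excerpt of the identifiers listed above, including the repeat.
excerpt = ["GO:0006782", "UMLS:C1514238", "UMLS:C1514238", "GO:0006779"]
node = seeded_node(excerpt)
print(node["ids"])  # ['GO:0006782', 'UMLS:C1514238', 'GO:0006779']
```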
{
"message": {
"query_graph": {
"edges": {
"e00": {
"subject": "n00",
"object": "n01",
"predicates":[
"biolink:related_to"
]
},
"e03": {
"subject": "n02",
"object": "n01"
},
"e04": {
"subject": "n01",
"object": "n03"
}
},
"nodes": {
"n00": {
"ids": [
"NCBIGene:211"
],
"categories":[
"biolink:Gene",
"biolink:Protein"
]
},
"n01": {
"categories": [
"biolink:Gene",
"biolink:Protein"
],
"is_set": true
},
"n02": {
"categories":[
"biolink:BiologicalProcessOrActivity"
],
"ids":[
"REACT:R-HSA-189451", "GO:0003870", "GO:0030170", "GO:0006782", "GO:0048821", "GO:0042541",
"REACT:R-HSA-1592230", "REACT:R-HSA-189445", "GO:0042802", "REACT:R-HSA-400206", "GO:0016746",
"GO:0005515", "GO:0033014", "UMLS:C0162565", "GO:0042168", "GO:0016747", "GO:0006779", "GO:0001666",
"REACT:R-HSA-177160", "REACT:R-HSA-556833", "GO:0046501", "GO:0016749", "GO:0033013", "GO:0016748",
"GO:0042440", "GO:0046148", "REACT:R-HSA-1592238", "REACT:1989781", "REACT:1852241",
"PathWhiz:PW000176", "SMPDB:SMP0120764", "PathWhiz:PW121971", "PathWhiz:PW064604", "SMPDB:SMP0120770",
"PathWhiz:PW121950", "PathWhiz:PW000698", "SMPDB:SMP0000484", "SMPDB:SMP0000223", "UMLS:C0597272",
"MESH:C537236", "REACT:R-HSA-1989781", "REACT:R-HSA-1852241", "GO:0019842", "GO:0004109",
"NCIT:C96833", "GO:0030097", "GO:0002520", "GO:0030099", "GO:0044255", "GO:0008289", "GO:0016246",
"REACT:R-HSA-1592245", "UMLS:C2984402", "WIKIPATHWAYS:WP561", "GO:0002262", "UMLS:C2611249",
"UMLS:C2610238", "UMLS:C1512401", "GO:0020027", "PathWhiz.Reaction:146672", "WIKIPATHWAYS:WP2875",
"PathWhiz.Reaction:147775", "PathWhiz.Reaction:147829", "GO:0030703", "GO:0061515",
"KEGG.PATHWAY:hsa00260", "GO:0050662", "PathWhiz.Reaction:1778", "PathWhiz.Reaction:146617",
"GO:0003824", "GO:0048037", "GO:0051186", "BIOCARTA:ahsppathway", "UMLS:C1514238", "UMLS:C1514238",
"GO:0034101", "GO:0048872", "PathWhiz.Reaction:148449", "PathWhiz.Reaction:149232", "GO:0016740",
"REACT:R-HSA-2151201", "GO:0007304", "WIKIPATHWAYS:WP2882", "UMLS:C1512222", "KEGG.PATHWAY:hsa00860",
"PathWhiz.Reaction:148504", "PathWhiz.Reaction:110250", "PathWhiz.Reaction:6637", "GO:0051188",
"UMLS:C3549233", "PathWhiz.Reaction:1871", "UMLS:C2610972", "UMLS:C0700704", "UMLS:C0597295",
"UMLS:C0002345", "MESH:D015533", "UMLS:C4476796"
]
},
"n03": {
"ids": [
"PUBCHEM.COMPOUND:3121"
]
}
}
}
}
}
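Hand-assembled queries of this size are easy to get subtly wrong, for example with a Python-style boolean, a trailing comma, or an edge that points at an undefined node. The sketch below is one way to catch those before submitting; `check_query` is a hypothetical helper that uses only the standard-library `json` module.

```python
import json


def check_query(text):
    """Parse query text as strict JSON and report dangling edge endpoints."""
    try:
        query = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg} at line {err.lineno}"]
    qg = query["message"]["query_graph"]
    problems = []
    for edge_key, edge in qg["edges"].items():
        for end in ("subject", "object"):
            if edge[end] not in qg["nodes"]:
                problems.append(
                    f"edge {edge_key}: {end} '{edge[end]}' is not a defined node"
                )
    return problems


good = '{"message": {"query_graph": {"edges": {"e0": {"subject": "a", "object": "b"}}, "nodes": {"a": {}, "b": {}}}}}'
python_bool = '{"message": {"query_graph": {"edges": {}, "nodes": {"a": {"is_set": True}}}}}'
dangling = '{"message": {"query_graph": {"edges": {"e0": {"subject": "a", "object": "zz"}}, "nodes": {"a": {}}}}}'

for text in (good, python_bool, dangling):
    print(check_query(text))
```

An empty list means the text parsed and every edge endpoint is a defined node; it says nothing about whether the identifiers or predicates are meaningful.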
This is of the form:
which I used to extract the following results: https://arax.ncats.io/?r=9ea95d04-238e-4e6d-b836-ded294ddaf5c
The results I described in the standup today (3/11) were:
ARAX Result 52 - containing GATA3 gene
BTE Result 1 - containing PPARGC1A/PPARGC1B/PPARA and RXRA
I do not have the expertise to fully analyze these results and could use some help :)
@karafecho Does Dr. Ferguson have any other genes (e.g., ones like ALAS1 whose expression changes are reversed when the compound is washed out) or chemicals that he wants to add to this question?
during a previous standup, Dr. Ferguson mentioned the gene HMGCR and some specific statins that were linked to liver toxicity.
We are wondering what else to do with this question before the mini-hackathon...
@colleenXu : Yes! In fact, Steve is re-analyzing his liver assay data and will create a ‘consensus’ list of genes that were identified across experiments with valproic acid, placing them into ranked tiers of 1) potency bins, 2) fold-change bins, 3) fold/potency bins. The idea, I think, is that genes associated with higher potency and fold-change bins may reveal effects related to typical exposure levels, but those in the lower potency and fold-change bins may provide more mechanistic insights. We can discuss this with Steve tomorrow, as I'm a bit fuzzy on some of the details here.
At our last meeting, Steve F. asked if Translator has clinical real-world evidence related to valproic acid.
So, I ran this query:
{
"message": {
"query_graph": {
"nodes": {
"n0": {
"ids": ["PUBCHEM.COMPOUND:3121"],
"categories": [
"biolink:ChemicalEntity"
],
"name": "Valproic Acid"
},
"n1": {
"categories": [
"biolink:DiseaseOrPhenotypicFeature"
]
}
},
"edges": {
"e0": {
"subject": "n0",
"object": "n1",
"predicates": ["biolink:has_real_world_evidence_of_association_with"]
}
}
}
}
}
PK= c0d7c057-b9fd-4ee4-8bcb-08ad79cd984b
COHD was the only clinical KP to return results. Here's a summary of the top 50 results:
score 'guessence'
4.503 Schizoaffective disorder, manic type
4.440 Schizoaffective disorder, bipolar type
4.264 Bipolar affective disorder, current episode mixed
4.223 Mixed bipolar affective disorder, severe, with psychosis
4.217 Severe manic bipolar I disorder without psychotic features
4.211 Bipolar disorder, mixed
4.210 Bipolar affective disorder, currently manic, severe, with psychosis
4.191 bipolar disorder
4.178 Severe bipolar I disorder, single manic episode with psychotic features
4.171 Severe bipolar disorder with psychotic features
4.134 Refractory idiopathic generalized epilepsy
4.133 Severe mixed bipolar I disorder without psychotic features
4.104 Severe bipolar disorder without psychotic features
4.074 Schizophrenia, Disorganized
4.068 Severe bipolar disorder
4.056 manic bipolar affective disorder
4.046 Severe bipolar I disorder
4.008 Severe manic bipolar I disorder
4.000 antisocial personality disorder
3.995 Thoughts of violence
3.985 Mania
3.982 Severe depressed bipolar I disorder with psychotic features
3.978 bipolar I disorder
3.972 Severe mixed bipolar I disorder with psychotic features
3.969 Chronic schizoaffective schizophrenia
3.956 Severe mixed bipolar I disorder
3.925 schizoaffective disorder
3.916 Bipolar affective disorder, currently manic, moderate
3.907 Mood alterations with manic symptoms
3.904 Mood disorder of manic type
3.894 Schizoaffective and schizophreniform disorders
3.878 Bipolar I disorder, single manic episode
3.876 Mixed bipolar affective disorder, moderate
3.842 EIG1
3.831 Homicidal Ideation
3.811 Auditory hallucinations
3.807 Lennox-Gastaut syndrome
3.794 Schizo-affective type schizophrenia, chronic state with acute exacerbation
3.785 Strange and inexplicable behavior
3.758 Homicidal behavior
3.756 Status epilepticus due to intractable idiopathic generalized epilepsy
3.747 Status epilepticus due to refractory epilepsy
3.746 Severe depressed bipolar I disorder without psychotic features
3.741 Severe depressed bipolar I disorder
3.731 Status epilepticus due to generalized idiopathic epilepsy
3.687 epilepsy with generalized tonic-clonic seizures
3.686 epilepsy, idiopathic generalized
3.679 Bilateral tonic-clonic seizure
3.661 visual epilepsy
3.647 Bipolar disorder in partial remission
The individual results provide additional information such as observed-expected ratios with CIs, etc.
@CaseyTa : I believe that COHD only exposes the top 50 answers to ARS queries. Any chance you can run the query above locally and see if anything interesting surfaces? FWIW, valproic acid is outside of scope for the current ICEES instances.
The Multiomics Wellness dataset has too few instances of individuals prescribed valproic acid, so no significant (and privacy-preserving) correlations could be derived.
@karafecho Yes, COHD was limiting the results to the top 50. I increased the limit to 500 and added a second hop to DILI to focus the results a bit more: https://arax.ncats.io/?r=c927aaac-ac25-4d56-bd09-23303968406e
{
"message": {
"query_graph": {
"nodes": {
"n0": {
"ids": ["PUBCHEM.COMPOUND:3121"],
"categories": [
"biolink:ChemicalEntity"
],
"name": "Valproic Acid"
},
"n1": {
"categories": [
"biolink:DiseaseOrPhenotypicFeature"
]
},
"n2": {
"ids": ["MONDO:0005359", "SNOMEDCT:197354009"],
"categories": ["biolink:DiseaseOrPhenotypicFeature"],
"name": "DILI"
}
},
"edges": {
"e0": {
"subject": "n0",
"object": "n1",
"predicates": ["biolink:has_real_world_evidence_of_association_with"]
},
"e1": {
"subject": "n1",
"object": "n2",
"predicates": ["biolink:has_real_world_evidence_of_association_with"]
}
}
}
}
}
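Adding a second hop to an existing one-hop query, as done above, can be sketched as a small helper that copies the query and attaches one new node and one new edge. The name `add_hop` and its signature are hypothetical; the identifiers and predicate are the ones from the queries in this thread.

```python
import copy


def add_hop(query, from_node, new_node_key, new_node, edge_key, predicates=None):
    """Return a copy of `query` with one extra node reachable from `from_node`."""
    new_query = copy.deepcopy(query)
    qg = new_query["message"]["query_graph"]
    if new_node_key in qg["nodes"] or edge_key in qg["edges"]:
        raise ValueError("node or edge key already in use")
    qg["nodes"][new_node_key] = new_node
    edge = {"subject": from_node, "object": new_node_key}
    if predicates:
        edge["predicates"] = list(predicates)
    qg["edges"][edge_key] = edge
    return new_query


RWE = "biolink:has_real_world_evidence_of_association_with"

one_hop = {
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": ["PUBCHEM.COMPOUND:3121"],
                    "categories": ["biolink:ChemicalEntity"],
                },
                "n1": {"categories": ["biolink:DiseaseOrPhenotypicFeature"]},
            },
            "edges": {"e0": {"subject": "n0", "object": "n1", "predicates": [RWE]}},
        }
    }
}

dili = {
    "ids": ["MONDO:0005359", "SNOMEDCT:197354009"],
    "categories": ["biolink:DiseaseOrPhenotypicFeature"],
}
two_hop = add_hop(one_hop, "n1", "n2", dili, "e1", predicates=[RWE])
```

Working on a deep copy leaves the original one-hop query untouched, so both versions can be submitted and compared.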
We have the node, but no edges were considered important enough. We'll revisit the threshold now that we have a new predicate.
So I think that part of the question is how is ALAS1 related to the liver disease/injury sometimes caused by valproic acid. The main query I ran didn't focus too much on the valproic acid side of this:
query={
"message": {
"query_graph": {
"edges": {
"e00": {
"subject": "n01",
"object": "n00"
},
"e01": {
"subject": "n01",
"object": "n02"
},
"e02": {
"subject": "liver",
"object": "n00"
}
},
"nodes": {
"liver":
{
"ids": ["UBERON:0002107"]
},
"n00": {
"categories": ["biolink:DiseaseOrPhenotypicFeature"]
},
"n01": {
"categories": [
"biolink:Gene"
]
},
"n02": {
"ids": [
"NCBIGene:211"
]
}
}
}
}
}
So it's ALAS1-Gene-Disease-Liver.
The PK is https://arax.ncats.io/?source=ARS&id=26ae39c7-d601-4315-b978-5a48d4596677
I've only looked at the ARAGORN results
There are a number of results where the disease returned is "Liver disease" and the relation between the gene and disease is something about regulation or association. The genes that come out of that are:
answer 1: entity_regulates_entity: WASF3, CLPX, TSPAN8, BCEN1, STK3, ORM1, JAG1, ITPKA, LIPA
answers 2, 5: gene_associated_with_condition: ALAD, PPOX, HMBS, UROD, ALAS2, CPOX, FECH
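A per-predicate gene summary like this one can be pulled out of a response programmatically. The sketch below assumes the TRAPI knowledge graph stores edges as a map of edge ID to an object with `subject`, `predicate`, and `object`; the toy response and its `EXAMPLE:` identifiers are made up for illustration and do not reproduce the ARAGORN results.

```python
from collections import defaultdict


def subjects_by_predicate(response, object_id):
    """Group edge subjects by predicate for knowledge-graph edges into `object_id`."""
    grouped = defaultdict(set)
    kg_edges = response["message"]["knowledge_graph"]["edges"]
    for edge in kg_edges.values():
        if edge["object"] == object_id:
            grouped[edge["predicate"]].add(edge["subject"])
    return {predicate: sorted(subjects) for predicate, subjects in grouped.items()}


# Toy response, fabricated for illustration only (placeholder identifiers).
toy_response = {
    "message": {
        "knowledge_graph": {
            "edges": {
                "k1": {
                    "subject": "EXAMPLE:geneA",
                    "predicate": "biolink:entity_regulates_entity",
                    "object": "EXAMPLE:liver_disease",
                },
                "k2": {
                    "subject": "EXAMPLE:geneB",
                    "predicate": "biolink:gene_associated_with_condition",
                    "object": "EXAMPLE:liver_disease",
                },
                "k3": {
                    "subject": "EXAMPLE:geneC",
                    "predicate": "biolink:gene_associated_with_condition",
                    "object": "EXAMPLE:liver_disease",
                },
                "k4": {
                    "subject": "EXAMPLE:geneC",
                    "predicate": "biolink:gene_associated_with_condition",
                    "object": "EXAMPLE:other_disease",
                },
            }
        }
    }
}

summary = subjects_by_predicate(toy_response, "EXAMPLE:liver_disease")
for predicate, genes in summary.items():
    print(predicate, genes)
```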
There are also a number of answers where the disease is hepatocellular carcinoma or malaria. I mostly ignored those b/c I don't think that is relevant to the question (maybe wrong?)
Result 10 is interesting:
The disease is Liver Disease and ARAGORN has found that there's a subset of the genes that are associated specifically with inherited porphyria: ALAD, PPOX, HMBS, UROD, ALAS2. Porphyria relates to the Heme discussion from a couple of weeks ago: "Porphyria is a group of disorders caused by an overaccumulation of porphyrin which helps hemoglobin, the protein that carries oxygen in the blood."
Result 16 is along the same lines: that same set of genes is also related to the GO term "heme biosynthetic process"
Result 23 shows that CPOX and FECH are also related to inherited porphyria
Results 36,37,38 show that the same set of genes is related to other porphyrin synthetic or metabolic processes.
If you want to dig further into porphyria, you can find out what chemicals it is correlated with:
query={
"message": {
"query_graph": {
"edges": {
"e00": {
"subject": "n01",
"object": "porphyria",
"predicates": ["biolink:correlated_with"]
}
},
"nodes": {
"porphyria": {
"categories":
[ "biolink:Disease" ],
"ids": [
"MONDO:0037939"
]
},
"n01": {
"categories": [
"biolink:ChemicalEntity"
]
}
}
}
}
}
https://arax.ncats.io/?source=ARS&id=1d8a1a8a-6961-4114-bed5-24474df601eb
There are a bunch of things that are probably disease related, like porphobilinogen and uroporphyrins, and also a bunch of things like statins (and valproic acid!). Presumably those are in there b/c they occasionally have porphyria as an adverse event.
You can clean that up a touch by adding ALAS1 to the above query, attached to the chemical. https://arax.ncats.io/?source=ARS&id=b7555415-f2fb-4c00-a044-d28d6e4499cc
In this case, the chemicals get filtered down to Porphobilinogen and aminolevulinic acid (and chloroquine :()
Result 14 is kind of interesting: it says that ALAS1 is linked to many forms of porphyria via Porphobilinogen, and all of those forms have in common that they have an association with the gene HMBS.
Result 21 groups both Porphobilinogen and aminolevulinic acid based on their relation to HEM2 protein [HEM2_HUMAN Delta-aminolevulinic acid dehydratase ]
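The query behind the ALAS1-filtered PK isn't pasted in this comment. As a rough sketch only, it could be built from the porphyria/chemical query by pinning ALAS1 and attaching it to the chemical node. The helper name `attach_pinned_node`, the node key `alas1qnode`, the edge key `e01`, and the choice to leave the new edge without a predicate are assumptions modeled on the final query in this comment, not a copy of what was actually submitted.

```python
import copy
import json

# Porphyria/chemical query from earlier in this comment, as a Python dict.
base_query = {
    "message": {
        "query_graph": {
            "edges": {
                "e00": {
                    "subject": "n01",
                    "object": "porphyria",
                    "predicates": ["biolink:correlated_with"],
                }
            },
            "nodes": {
                "porphyria": {
                    "categories": ["biolink:Disease"],
                    "ids": ["MONDO:0037939"],
                },
                "n01": {"categories": ["biolink:ChemicalEntity"]},
            },
        }
    }
}


def attach_pinned_node(query, node_key, curie, category, attach_to, edge_key):
    """Return a copy of `query` with a pinned node linked to `attach_to`.

    Hypothetical helper for illustration; the input query is not modified.
    """
    new_query = copy.deepcopy(query)
    qgraph = new_query["message"]["query_graph"]
    if attach_to not in qgraph["nodes"]:
        raise KeyError(f"no node named {attach_to!r} in the query graph")
    qgraph["nodes"][node_key] = {"categories": [category], "ids": [curie]}
    # Assumption: no predicate on the new edge, so any relation is accepted.
    qgraph["edges"][edge_key] = {"subject": attach_to, "object": node_key}
    return new_query


alas1_query = attach_pinned_node(
    base_query, "alas1qnode", "NCBIGene:211", "biolink:Gene", "n01", "e01"
)
print(json.dumps(alas1_query, indent=2))
```

Leaving the predicate off the ALAS1 edge keeps the filter loose; adding a `predicates` list there would narrow the chemicals further.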
The final query is just like the previous one, but with the chemical replaced by a biological process:
{
"message": {
"query_graph": {
"edges": {
"e00": {
"subject": "n01",
"object": "porphyria",
"predicates": ["biolink:correlated_with"]
},
"e01": {
"subject": "n01",
"object": "alas1qnode"
}
},
"nodes": {
"porphyria": {
"categories":
[ "biolink:Disease" ],
"ids": [
"MONDO:0037939"
]
},
"n01": {
"categories": [
"biolink:BiologicalProcessOrActivity"
]
},
"alas1qnode": {
"categories": ["biolink:Gene"],
"ids": ["NCBIGene:211"]
}
}
}
}
}
Edited to add PK: https://arax.ncats.io/?source=ARS&id=dc161f97-79e2-4d86-9823-b482fbd87f8b (Thanks for pointing out it was missing @colleenXu )
The processes are what you might expect at this point, things like heme and hemoglobin biosynthetic processes. Aminolevulinic acid and HMBS return as grouping nodes (results 11,12), as do a number of aminolevulinate synthase mitochondrial proteins (results 15-18):
So it seems that ALAS1 is related to heme/hemoglobin/porphyrin synthesis and metabolic processes, and messing with ALAS1 can therefore lead to various porphyrias. There are a number of genes/proteins implicated through this process. I am pretty sure that the sprot proteins showing up in these results are the same as a lot of the genes that showed up (HEM0_HUMAN = ALAS2 etc). Seems like they should have been conflated, so I'm not sure why they're showing up this way.
Note: I think the link between ALAS1 and liver toxicity/damage has been explored in queries / articles / comments from earlier standups. However, Chris Bizon's more recent comments do describe an approach that focuses on it.
Here's another PK for Chris Bizon's first query in this comment: https://arax.ncats.io/?r=0dbbe4be-964c-4327-a1f2-d82aff469a3e BTE had a bug that we fixed, which is why there was previously an error. EDIT: there's still a bug, we're tracking it here.
I didn't see a PK for the last query in Chris Bizon's comment, so I ran it here: https://arax.ncats.io/?r=e401577f-86ce-4dd1-a2a6-98d656511d0a
EDIT: added a PK for Eugene Hinderer's first query in this comment: https://arax.ncats.io/?r=84c14bc9-30ad-46b6-af35-9e782e3d1b80
Yes, thanks for pointing that out @colleenXu . The big picture here is not particularly novel and is following closely onto what you and others have done upthread. This is in some ways more looking at how ARAGORN would explore a similar space. In terms of new information, I think maybe there's a set of genes that may or may not be interesting to the SME.
I am posting a list of 370 'consensus' gene transcripts that were observed up to ~10X Cmax across experiments with valproic acid. Steve F. provided the list and noted that while ALAS1 was the most potent in this set, it curiously was not observed in HepaRG (a surrogate to primary human hepatocytes used for these data).
Is this list in order of potency?
interesting graphical abstract here, related to @brettasmi and @suihuang-ISB 's work in the earlier comment
https://www.sciencedirect.com/science/article/abs/pii/S0009912013002890?via%3Dihub
Mark wrote a query with the top 3 genes in Stephen Ferguson's list
https://arax.ncats.io/index.html?r=cb943e9f-9f1d-454e-996e-543ed3f68bf0
FWIW, I cross-referenced the list of genes that Chris B identified as related to both ALAS1 and liver disease with Steve F.'s list of potency-ranked 'consensus' genes and identified ORM1 (orosomucoid 1, #202 on Steve's list, mutations associated with appendicitis and dry eye syndrome) and LIPA (lysosomal acid or lipase A, #237 on Steve's list, mutations associated with lysosomal acid lipase deficiency and cholesterol ester storage disease).
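For anyone who wants to repeat that cross-reference, here is a minimal sketch of one way to do it. The function name `cross_reference` is made up for illustration, and `toy_ranked` is a placeholder standing in for Steve F.'s list (which isn't reproduced in this thread), so the ranks it prints are not the real ones. The candidate genes are the ones quoted from the ARAGORN "Liver disease" answers upthread.

```python
def cross_reference(candidate_genes, ranked_genes):
    """Map each candidate gene found in `ranked_genes` to its 1-based rank.

    Matching is case-insensitive on the gene symbol; if a symbol appears
    more than once in the ranked list, the first (best) rank is kept.
    """
    rank_of = {}
    for position, symbol in enumerate(ranked_genes, start=1):
        rank_of.setdefault(symbol.upper(), position)
    return {
        gene: rank_of[gene.upper()]
        for gene in candidate_genes
        if gene.upper() in rank_of
    }


# Genes from the ARAGORN "Liver disease" answers quoted upthread.
aragorn_genes = [
    "WASF3", "CLPX", "TSPAN8", "BCEN1", "STK3", "ORM1", "JAG1", "ITPKA", "LIPA",
    "ALAD", "PPOX", "HMBS", "UROD", "ALAS2", "CPOX", "FECH",
]

# Placeholder ranked list (NOT the real potency order); swap in the real
# list, one gene symbol per entry, most potent first.
toy_ranked = ["GENE_A", "ORM1", "GENE_B", "LIPA"]

hits = cross_reference(aragorn_genes, toy_ranked)
print(hits)
```

Matching on symbols is fragile (aliases, casing), so mapping both lists to identifiers first would be safer.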
anybody interested in turning this list into identifiers?
I attempted to capture some of the suggested queries and Translator features that were discussed during yesterday's mini-hackathon with Steve F.
Suggested queries:
Suggested Translator features (either absent, partially implemented, or could be improved):
anybody interested in turning this list into identifiers?
Here's a start @cbizon. These are based on exact name matches in SPOKE. I suppose there could be errors, but it's unlikely.
@brettasmi @cbizon I saw some blanks in Brett's csv, so I filled them in manually using Genecards.
Note that I used the SRI Name Resolver with "cannabidiol" to get the PUBCHEM.COMPOUND ID I use in the queries below
Looking for Genes related to valproic acid and CBD. I see genes that I remember from an earlier query between valproic acid and heme (CYP genes, oxidative stress genes).
https://arax.ncats.io/?r=1755b234-843c-4f51-88a4-aa1e8eb84b96 and BTE response, since there's a redis issue: CBD_gene_VPA.txt
I therefore did another query looking for genes related to valproic acid and CBD and heme.
https://arax.ncats.io/?r=64489bb1-db6a-4798-bae6-f26798309801 and BTE response, since there's a redis issue: CBD_gene_VPA_heme.txt
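The queries behind these two PKs aren't pasted here. As a sketch of the general shape, assuming one unpinned Gene node linked by `biolink:related_to` to each pinned entity, something like the following could generate them. The function name `shared_neighbor_query`, the node/edge keys, and the edge direction are all assumptions, and the CBD and heme identifiers are left as placeholders because they aren't given in this thread (PUBCHEM.COMPOUND:3121 is the valproic acid ID used in the queries upthread).

```python
import json


def shared_neighbor_query(pinned_curies, neighbor_category="biolink:Gene"):
    """Build a TRAPI-style query graph with one unpinned node linked to every pinned CURIE.

    Hypothetical helper: each pinned entity gets its own node (n00, n01, ...)
    and its own edge (e00, e01, ...) from the shared node.
    """
    nodes = {"shared": {"categories": [neighbor_category]}}
    edges = {}
    for index, curie in enumerate(pinned_curies):
        node_key = f"n{index:02d}"
        nodes[node_key] = {"ids": [curie]}
        edges[f"e{index:02d}"] = {
            "subject": "shared",
            "object": node_key,
            "predicates": ["biolink:related_to"],
        }
    return {"message": {"query_graph": {"edges": edges, "nodes": nodes}}}


VALPROIC_ACID = "PUBCHEM.COMPOUND:3121"  # from the queries upthread
CBD = "CBD_CURIE_PLACEHOLDER"    # placeholder: use the ID from the Name Resolver
HEME = "HEME_CURIE_PLACEHOLDER"  # placeholder: ID not given in this thread

two_way = shared_neighbor_query([VALPROIC_ACID, CBD])
three_way = shared_neighbor_query([VALPROIC_ACID, CBD, HEME])
print(json.dumps(three_way, indent=2))
```

Adding heme as a third pinned node is what turns the first query into the second: the shared Gene node has to be related to all three.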
The use of the Friday standup calls to cover a more nuanced question in greater depth was discussed at the most recent Relay meeting, and this is the first of those questions of the month. Please use this ticket to discuss the question, potential approaches to formulating queries to address it, and any possible blocking issues. We can discuss these points further and touch base on our progress and next steps during the normally scheduled Friday stand up time.
Kick-off challenge question of the month:
Submitting team: SRI
SME: Stephen Ferguson, PhD, Scientist, Molecular Toxicology and Genomics Group within the Biomolecular Screening Branch of the National Toxicology Program at the National Institute of Environmental Health Sciences
Background and challenge question: “I have started to connect the dots a bit between genes that are reversibly changed in vitro for liver models versus clinical and mechanistic data. … For example, I saw ALAS1 gene is clearly induced in a highly potent manner by valproic acid, and when taking the chemical away, the signal (along with many other genes) goes away, while others remain. A quick Google search didn’t show a lot of connection between ALAS1 and valproic acid. However, mitochondrial function was linked to ALAS1 and valproic separately. So, I’m wondering what might be the best ways to do this kind of next-level search that would connect chemical-gene targets-mechanisms/pharmacology-pathology across these four data types.”