NCATSTranslator / testing

Materials and tools for testing Translator components

Question of the Month #2: β-sitosterol and Covid 19 #185

Closed MarkDWilliams closed 11 months ago

MarkDWilliams commented 2 years ago

Background and Challenge Question

Blood metabolomics profiling of healthy and COVID19 patients uncovered that patients with severe COVID-19 (WHO score 5-7) had significantly elevated plasma levels of β-sitosterol (CHEMBL221542), a phytosterol found in most plants that is not synthesized in animals (see figure).

image1 LEGEND: Plasma levels of β-sitosterol in healthy controls and COVID19 patients, divided for this study into two groups, ‘MODERATE’ (mild-moderate, WHO severity scores 1-4) and ‘SEVERE’ (WHO scores 5-7), with the indicated number of individuals per group (n). Raw data from measurement by Metabolon in arbitrary units (y-axis). ‘Healthy’ are historical controls from an independent study (age and gender matched). Each COVID19 data-point (symbol) is an average of two measurements at admission and approximately a week later. p-values were calculated with the Mann-Whitney test. Horizontal whisker bars represent average, SEM and STD, respectively.

QotM Challenge question: What could explain the high plasma levels of β-sitosterol in severe (but not milder) cases of COVID19?

NOTE: We do not have an answer yet – this is a new observation. Thus, this challenge-of-the-month is a genuine, focused research question that is novel. Importantly, the question is not obviously rooted in a preexisting field with its own deep domain-specific knowledge and language. Thus, it should appeal to the inquisitive, informed non-biomedical scientist. This challenge can therefore serve as a test case for improving the Translator by revealing pain-points, unmet requirements, missing knowledge domains, etc. It is expected that in the near future, owing to the increasing embrace of systematic (hypothesis-free, “discovery-driven”) approaches enabled by multi-omics technologies, there will be a flood of such findings in search of explanations.

colleenXu commented 2 years ago

@suihuang-ISB is there a paper or presentation on this finding, that provides more context for the figure + measurements?

colleenXu commented 2 years ago

This shows my thought process through two sets of queries. My commentary on the results is based only on looking at BTE's output, and the bold text shows the main part of the longer queries.

  1. beta-sitosterol -> Disease. https://arax.ncats.io/?r=192f954e-092a-400c-8c8e-4562c8447611 In the results, I found Sitosterolemia interesting. Perhaps the genes and stuff related to this disease are also related to covid.
  2. beta-sitosterol -> sitosterolemia -> NamedThing <- COVID19. https://arax.ncats.io/?r=8b2aa84c-459e-4892-8856-10ea469ce7b5 The results show that sitosterolemia + COVID19 are both related to statins, sterols, APOE, and the liver/intestine.

  1. COVID-19 -> Gene. https://arax.ncats.io/?r=b100c4a2-ec2a-4e0b-ba46-47ad7876f2e3 I was particularly interested in one of the top-scored BTE results, APOE e4, because of its connection to cholesterol. However, I knew that APOE e4 wasn't directly connected to sitosterolemia from earlier queries.
  2. beta-sitosterol -> sitosterolemia -> NamedThing <- APOE e4 <- COVID19. https://arax.ncats.io/?r=b8f969df-50c1-43d0-a586-6af962133574 sitosterolemia + APOE e4 are both connected to APOE (good but also duh). And there wasn't more interesting info...
  3. beta-sitosterol -> sitosterolemia -> Gene (is_set:True) -> NamedThing <- APOE e4 <- COVID19. https://arax.ncats.io/?r=d13bfc30-22c3-4599-8020-9dd3e981a960. A deeper exploration into how sitosterolemia + APOE e4 are connected. Includes neuro-diseases, breast cancer, and genes....not sure how this relates back to covid though. Summarizing the 11 results below:
    1. APOE e4 -> breast cancer <- Genes <- sitosterolemia
    2. APOE e4 -> neurodegenerative disease <- Genes <- sitosterolemia
    3. APOE e4 -> Alzheimer disease <- Genes <- sitosterolemia
    4. APOE e4 -> Alzheimer disease <- Genes <- sitosterolemia (different disease ID)
    5. APOE e4 -> Butyrylcholinesterase <- APOE <- sitosterolemia
    6. APOE e4 -> APOE <- Genes <- sitosterolemia
    7. APOE e4 -> depressive disorder <- Genes <- sitosterolemia
    8. APOE e4 -> dementia <- Genes <- sitosterolemia
    9. APOE e4 -> mental deterioration <- Genes <- sitosterolemia
    10. APOE e4 -> cognitive impairment <- APOE <- sitosterolemia
    11. APOE e4 -> delirium <- APOE <- sitosterolemia
cbizon commented 2 years ago

@colleenXu can you post or send me the trapi used for https://arax.ncats.io/?r=8b2aa84c-459e-4892-8856-10ea469ce7b5 ?

colleenXu commented 2 years ago

@cbizon I can see it in ARAX when I click on query->JSON. EDIT: that's not quite TRAPI, but it's close

TRAPI query: beta-sitosterol -> sitosterolemia -> NamedThing <- COVID19

```
{
    "message": {
        "query_graph": {
            "edges": {
                "e00": { "subject": "n0", "object": "n1" },
                "e01": { "subject": "n1", "object": "n2" },
                "e02": { "subject": "n3", "object": "n2" }
            },
            "nodes": {
                "n0": {
                    "ids": ["CHEMBL.COMPOUND:CHEMBL221542"],
                    "categories": ["biolink:ChemicalEntity"]
                },
                "n1": {
                    "ids": ["MONDO:0020747", "MONDO:0008863"],
                    "categories": ["biolink:Disease"]
                },
                "n2": { "categories": ["biolink:NamedThing"] },
                "n3": {
                    "ids": ["MONDO:0100096"],
                    "categories": ["biolink:Disease"]
                }
            }
        }
    }
}
```
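The arrow shorthand used in this thread can be recovered from the JSON itself. Below is a minimal sketch, assuming only the `query_graph` layout of the query above; `describe_edges` is a hypothetical helper written for illustration and is not part of ARAX, BTE, or any TRAPI library.

```python
def describe_edges(query_graph):
    """Return one 'subject -> object' string per edge, in edge-id order."""
    nodes = query_graph["nodes"]

    def label(key):
        node = nodes[key]
        # Prefer pinned identifiers; fall back to the category for unpinned nodes.
        return ",".join(node.get("ids") or node.get("categories", [key]))

    return [
        f"{label(edge['subject'])} -> {label(edge['object'])}"
        for _, edge in sorted(query_graph["edges"].items())
    ]


# Query graph copied from the TRAPI query above.
query_graph = {
    "edges": {
        "e00": {"subject": "n0", "object": "n1"},
        "e01": {"subject": "n1", "object": "n2"},
        "e02": {"subject": "n3", "object": "n2"},
    },
    "nodes": {
        "n0": {"ids": ["CHEMBL.COMPOUND:CHEMBL221542"],
               "categories": ["biolink:ChemicalEntity"]},
        "n1": {"ids": ["MONDO:0020747", "MONDO:0008863"],
               "categories": ["biolink:Disease"]},
        "n2": {"categories": ["biolink:NamedThing"]},
        "n3": {"ids": ["MONDO:0100096"], "categories": ["biolink:Disease"]},
    },
}

for line in describe_edges(query_graph):
    print(line)
```

Both the sitosterolemia edge and the COVID19 edge point at the unpinned `biolink:NamedThing` node, which is what the `-> NamedThing <-` shorthand expresses.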
andrewsu commented 2 years ago

Quick note that it looks like the UMLS ID for beta-sitosterol is C0106127. It's not mapped to CHEMBL.COMPOUND:CHEMBL221542 by node normalizer (I'll create an issue momentarily to investigate why), but it might be worth adding that as an additional identifier in queries...

colleenXu commented 2 years ago

Re-ran the first query of the first set, using both the chembl + umls IDs for beta-sitosterol. We get more info from idisk and semmeddb after adding the UMLS ID!

beta-sitosterol -> Disease. https://arax.ncats.io/?r=a9210649-c654-4ab0-9949-b4603ff39c09
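As a rough illustration of "adding the UMLS ID as an additional identifier", the sketch below appends extra CURIEs to one node of a query graph. The one-hop query layout is assumed from the other queries in this thread, and `add_ids` is a hypothetical helper, not a BTE or ARAX function.

```python
import copy


def add_ids(query_graph, node_key, extra_ids):
    """Return a copy of query_graph with extra_ids appended to one node (no duplicates)."""
    updated = copy.deepcopy(query_graph)
    ids = updated["nodes"][node_key].setdefault("ids", [])
    for curie in extra_ids:
        if curie not in ids:
            ids.append(curie)
    return updated


# beta-sitosterol -> Disease, pinned only to the ChEMBL identifier.
one_hop = {
    "nodes": {
        "n0": {"ids": ["CHEMBL.COMPOUND:CHEMBL221542"],
               "categories": ["biolink:ChemicalEntity"]},
        "n1": {"categories": ["biolink:Disease"]},
    },
    "edges": {"e00": {"subject": "n0", "object": "n1"}},
}

# Add the UMLS identifier that the node normalizer did not map.
with_umls = add_ids(one_hop, "n0", ["UMLS:C0106127"])
print(with_umls["nodes"]["n0"]["ids"])
```

Working on a copy keeps the original query available for a before/after comparison of the results.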

colleenXu commented 2 years ago

Also, I must have misremembered. There are results for a beta-sitosterol -> NamedThing <- COVID19 query. https://arax.ncats.io/?r=a36f2d5c-e442-407e-93a1-4e2b432e5d16

colleenXu commented 2 years ago

Also kinda interesting:

COVID -> NamedThing <- sitosterolemia genes (ABCG5, ABCG8)

```
{
    "message": {
        "query_graph": {
            "edges": {
                "e00": { "subject": "n0", "object": "n1" },
                "e01": { "subject": "n2", "object": "n1" }
            },
            "nodes": {
                "n0": {
                    "ids": ["MONDO:0100096"],
                    "categories": ["biolink:Disease"]
                },
                "n1": { "categories": ["biolink:NamedThing"] },
                "n2": {
                    "ids": [
                        "NCBIGene:64240",
                        "NCBIGene:64241",
                        "NCBIGene:67470",
                        "UniProtKB:Q9H221-1"
                    ],
                    "categories": ["biolink:Gene"],
                    "is_set": true
                }
            }
        }
    }
}
```

https://arax.ncats.io/?r=f3ae2e25-775e-424c-947a-97ea5fcea02f


Other useful starting points for queries besides COVID19?

brettasmi commented 2 years ago

I am providing some additional details on this query to suggest additional pathways for query construction.

First, here is a figure that @suihuang-ISB generated showing TOTAL cholesterol (i.e. no differentiation between HDL and LDL) and B-sitosterol levels.

cholesterol_b-sitosterol

Second, here is the WHO ordinal scale that was used to rank the severity of disease in these patients.

| Score | Severity | Description |
|---|---|---|
| 0 | Healthy | Uninfected: No clinical or virological evidence of infection |
| 1 | Mild | Ambulatory: No limitation of activities |
| 2 | Mild | Ambulatory: Limitations of activities |
| 3 | Moderate | Hospitalized, mild disease: no oxygen therapy |
| 4 | Moderate | Hospitalized, mild disease: oxygen by nasal cannula or NRB mask |
| 5 | Severe | Hospitalized, severe disease: non-invasive ventilation (i.e. CPAP) or high-flow oxygen (i.e. HFNC, Venturi Mask) |
| 6 | Severe | Hospitalized, severe disease: intubation and mechanical ventilation |
| 7 | Severe | Hospitalized, severe disease: ventilation & any of the following additional organ support - vasopressors, RRO (dialysis), ECMO |
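For anyone filtering patient metadata by this scale, here is a small sketch of the lookup. The severity labels are copied from the scale above, and the MODERATE (1-4) / SEVERE (5-7) grouping is the one described in the figure legend; the function names are hypothetical.

```python
# Severity labels copied from the WHO ordinal scale above.
WHO_SEVERITY = {
    0: "Healthy",
    1: "Mild", 2: "Mild",
    3: "Moderate", 4: "Moderate",
    5: "Severe", 6: "Severe", 7: "Severe",
}


def severity(score):
    """Return the severity label for a WHO ordinal score (0-7)."""
    if score not in WHO_SEVERITY:
        raise ValueError(f"not a WHO ordinal score: {score!r}")
    return WHO_SEVERITY[score]


def study_group(score):
    """Return the group used in the figure: MODERATE for scores 1-4, SEVERE for 5-7."""
    if 1 <= score <= 4:
        return "MODERATE"
    if 5 <= score <= 7:
        return "SEVERE"
    raise ValueError(f"score {score!r} is outside the patient groups (1-7)")


for s in range(1, 8):
    print(s, severity(s), study_group(s))
```

Note that the study's MODERATE group merges the scale's Mild and Moderate labels, so the two functions disagree on purpose for scores 1 and 2.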

Third, following on the above, it is worth noting that patients who are hospitalized with severe COVID-19 are being treated in part (largely?) for symptom management and not necessarily for the virus itself. In other words, could this phenotype of high B-sitosterol be a byproduct of those disease complications and treatments? Is it possible to pull the complications of severe disease out of Translator?

Given all of the above, we started examining which treatments severely ill patients might receive that could give rise to this observation. I asked my partner, an MD who has treated severely ill COVID-19 patients, and her mind immediately jumped to propofol, a sedative used for patients on ventilation. In some subsequent literature research, we've learned some curious things about propofol (DRUGBANK:DB00818), its side effects, and how it is administered / prepared. Unfortunately, we haven't had time to query against Translator to see if we can discover the same pieces of information via TRAPI ahead of tomorrow's meeting.

For the time being, we'll withhold our exact literature findings in hopes that the group can uncover the same information via Translator. We also hope this inspires some inquiry along the same lines of symptom management, the related treatments, and their preparations; something along the lines of:

(A) Disease: COVID-19 - causes -> (B) Symptom/Complication: ? <- treats - (C) ChemicalEntity: ?

Then (C) ChemicalEntity - ? (possibly many hops) ? - B-sitosterol
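The first half of this pattern, (A) to (B) to (C), could be written as a TRAPI query graph along the following lines. This is only a sketch: `biolink:causes` and `biolink:treats` are Biolink predicates, but whether any KP asserts edges with exactly these predicates for COVID-19 is an assumption, and the node keys and function name are made up for readability.

```python
import json


def treatment_bridge_query(disease_id):
    """Disease -causes-> complication <-treats- chemical, as a TRAPI-style dict."""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    "disease": {"ids": [disease_id],
                                "categories": ["biolink:Disease"]},
                    "complication": {"categories": ["biolink:DiseaseOrPhenotypicFeature"]},
                    "drug": {"categories": ["biolink:ChemicalEntity"]},
                },
                "edges": {
                    # (A) -> (B): the disease causes the symptom or complication.
                    "e_causes": {"subject": "disease", "object": "complication",
                                 "predicates": ["biolink:causes"]},
                    # (C) -> (B): the chemical treats that symptom or complication.
                    "e_treats": {"subject": "drug", "object": "complication",
                                 "predicates": ["biolink:treats"]},
                },
            }
        }
    }


# MONDO:0100096 is the COVID19 identifier used elsewhere in this thread.
print(json.dumps(treatment_bridge_query("MONDO:0100096"), indent=2))
```

The second half, (C) to B-sitosterol over possibly many hops, is left out because the number of hops is not known in advance.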

colleenXu commented 2 years ago

This is my brainstorming based on pubmed searching:

Are the supportive treatments for COVID causing the sitosterol levels?

Asides:

MarkDWilliams commented 2 years ago

Here is the query that I ran to find drugs that "treat" Covid-19. Propofol does show up, though it's ranked fairly low and I don't know that we could count on semmed to catch these kinds of 'related to treatment but not directly treating drugs'. https://arax.ncats.io/?r=b1c8394e-e117-4b3b-8c9e-bd3f97a82ce7

Genomewide commented 2 years ago

[Andy Crouse] - Spoke to an SME about parenteral nutrition and propofol. She said that, at her hospital, people could be on propofol for a month or more if they are unable to be extubated from the vent. They use other drugs to try to limit propofol in some cases because it has to be counted as a fat and is high in calories which complicates balancing nutrition.

Does the EHR KG capture time on a drug or only whether or not they are given the drug? The former sounds much harder to capture, but I am less familiar with that.

karafecho commented 2 years ago

@andy: Yes, time is complicated but not impossible to capture in EHR data. We (Exposures Provider) have implemented several approaches in ICEES (not ICEES KG) to capture the longitudinal nature of both EHR data and environmental exposures. We also have been working with David Borland and David Gotz at UNC to adapt their Cadence visual analytics platform to support longitudinal analysis in ICEES (see this paper). Clinical Data Provider and Multiomics EHR Risk Provider likewise have adopted various approaches to capture the longitudinal nature of EHR data. Unfortunately, space and time are a bit out of scope for Translator, at least for now, although the Clinical Data Committee does have an open ticket to address this issue.

karafecho commented 2 years ago

Following up from last Friday's standup call, and in response to the suggestion to examine real-world clinical evidence related to beta-sitosterol and COVID19, I ran the following ARS queries:

  1. ChemicalEntity (sitosterol, PUBCHEM.COMPOUND:222284) - real_world_evidence_of_association_with - DiseaseOrPhenotypicFeature

    • no results
    • PK=b7291254-53b7-4b2b-bd65-5e9711d53cf8
  2. DiseaseOrPhenotypicFeature (coronavirus infection, MONDO:0005719) - real_world_evidence_of_association_with - ChemicalEntity

    • 500 answers from COHD COVID
    • PK=352bd1b4-2712-4f51-8705-d78acee554bb

A couple of comments:

  1. The fact that Query 1 did not yield any results indicates only that sitosterol was not prescribed/administered to patients in Translator cohorts.
  2. In Query 2, a substitution of MONDO:0100096 (COVID19) for MONDO:0005719 (coronavirus infection) yields zero results.
  3. ICEES COVID does not have a valid TRAPI endpoint, although I can run direct non-TRAPI queries.
  4. I believe @CaseyTa is going to run some additional real-world-evidence queries.
karafecho commented 2 years ago

I ran a progression of exploratory queries intended to examine relationships between beta-sitosterol, coronary heart disease, coronavirus infection/COVID19, and potential intermediary genes. The rationale driving the queries is the possibility that beta-sitosterol may reduce risk of heart disease, possibly by reducing cholesterol production. In fact, the US FDA has issued a proposed Authorized Health Claim on plant sterols and coronary heart disease.

  1. A one-hop query ChemicalEntity (beta-sitosterol, PUBCHEM.COMPOUND:222284) - related_to - DiseaseOrPhenotypicFeature established a relationship between beta-sitosterol and coronary heart disease

    • heart disease, coronary heart disease, atherosclerosis, acute coronary syndrome, CVD ("MONDO:0005267","MONDO:0005542","MONDO:0005311","MONDO:0005542","MONDO:0004995")
    • PK = ddaa3051-300e-4e55-ac83-55b48f3994b1
  2. A two-hop query ChemicalEntity (sitosterol, PUBCHEM.COMPOUND:222284) - related_to - DiseaseOrPhenotypicFeature (heart disease, coronary heart disease, atherosclerosis, acute coronary syndrome, CVD, "MONDO:0005267","MONDO:0005542","MONDO:0005311","MONDO:0005542","MONDO:0004995") - related_to - DiseaseOrPhenotypicFeature established a relationship with COVID19

    • COVID19 (MONDO:0100096)
    • PK = 4be34c8d-7a7a-4450-b66b-9cf9533925d9

[Disease as n02 yielded same results]

  3. A 'triangular' four-node query ChemicalEntity (sitosterol, PUBCHEM.COMPOUND:222284) - related_to - DiseaseOrPhenotypicFeature (heart disease, coronary heart disease, atherosclerosis, acute coronary syndrome, CVD, "MONDO:0005267","MONDO:0005542","MONDO:0005311","MONDO:0005542","MONDO:0004995") - related_to - DiseaseOrPhenotypicFeature (COVID19, MONDO:0100096), with Gene related_to both ChemicalEntity and DiseaseOrPhenotypicFeature (COVID19, MONDO:0100096) yielded 24 results from ARAGORN.
    • Results were varied, and some were a bit odd (titanium, hematite), but others may (or may not) be interesting (vehicle emissions, dermatophagoides pteronyssinus antigen p 1 [dust mite], STAT1)
    • Also worth noting is that some of the results appeared to primarily reflect the relationship between coronary heart disease and COVID, not beta-sitosterol and COVID
    • PK = 6c9c1e01-fd7b-4b0c-8398-249f773a428b
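Since the 'triangular' query is hard to picture from prose, here is one way it might look as a query graph. This is a reconstruction, not the TRAPI that was actually submitted: the edge directions and node keys are assumptions, and `dangling_endpoints` is a hypothetical sanity check.

```python
import json

# Identifiers copied from the comment above; dict.fromkeys drops the repeated MONDO:0005542.
HEART_DISEASE_IDS = list(dict.fromkeys([
    "MONDO:0005267", "MONDO:0005542", "MONDO:0005311",
    "MONDO:0005542", "MONDO:0004995",
]))

RELATED = ["biolink:related_to"]

triangle_query = {
    "message": {
        "query_graph": {
            "nodes": {
                "n00": {"ids": ["PUBCHEM.COMPOUND:222284"],
                        "categories": ["biolink:ChemicalEntity"]},
                "n01": {"ids": HEART_DISEASE_IDS,
                        "categories": ["biolink:DiseaseOrPhenotypicFeature"]},
                "n02": {"ids": ["MONDO:0100096"],
                        "categories": ["biolink:DiseaseOrPhenotypicFeature"]},
                "n03": {"categories": ["biolink:Gene"]},
            },
            "edges": {
                "e00": {"subject": "n00", "object": "n01", "predicates": RELATED},
                "e01": {"subject": "n01", "object": "n02", "predicates": RELATED},
                # The Gene node closes the triangle: it touches both the chemical and COVID19.
                "e02": {"subject": "n03", "object": "n00", "predicates": RELATED},
                "e03": {"subject": "n03", "object": "n02", "predicates": RELATED},
            },
        }
    }
}


def dangling_endpoints(query_graph):
    """List (edge id, node key) pairs whose node key is not declared under 'nodes'."""
    nodes = query_graph["nodes"]
    return [
        (edge_id, end)
        for edge_id, edge in query_graph["edges"].items()
        for end in (edge["subject"], edge["object"])
        if end not in nodes
    ]


print(dangling_endpoints(triangle_query["message"]["query_graph"]))
print(json.dumps(triangle_query, indent=2))
```

A check like `dangling_endpoints` catches one class of identifier mix-ups (an edge naming a node key that does not exist) before a query is sent.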

Kudos to @maximusunc for debugging the identifier issues in the third query!

brettasmi commented 2 years ago

@suihuang-ISB has produced the following check-in at the midpoint of this QOTM:

This is a midway assessment of our progress on the Challenge-of-the-Month. One motivation for presenting the explanation of the observed association between plasma b-sitosterol and severe COVID-19 as a challenge was that this case does not require “deep” medical knowledge (pathophysiology, clinical pharmacology, anatomy). Instead, it could be solved by an informed non-MD, such as an investigative journalist or a detective. The key aspect is that solving the challenge is expected to require logical, methodical thinking and connecting of the dots by a human mind, which the Translator is not likely to provide. But we can at least ask whether, as a minimal functionality, the Translator is able to fill the gaps in specific knowledge needed to make the logic work.

Indeed, at the last stand-up it was observed that this challenge cannot be solved using the current Translator alone; it had to be augmented by conventional web search (Google/PubMed). Human intuition and logical reasoning are required to first break this challenge apart into elementary questions, which can then lead to queries amenable to the Translator. Below is a summary of our approach and that of other teams (as reported in the stand-ups) so far, leading to a new specific hypothesis.

  1. WHAT IS b-sitosterol? ➡ The answer was quickly found: it is an exogenous compound, made by most plants and not by human metabolism. This answer came from web searches, although b-sitosterol is listed in ChEMBL or DrugBank. The Translator does not answer “WHAT IS...”-questions, so we had to resort to e.g. Wikipedia.

  2. WHY is sitosterol in the human body in the first place, and why is it found only in patients with severe COVID-19? ➡ Being an exogenous compound, it must logically have been either (i) self-administered or (ii) administered during care in the hospital. Then the connection with COVID-19 must be explained.

    i. Based on web searches suggesting that b-sitosterol has been credited with immune-stimulatory effects (not verified) and is widely available as a natural dietary supplement, it is tempting to assume that patients with progressing COVID19 symptoms took b-sitosterol as self-medication. However, this possibility cannot be confirmed without an empirical study.
    ii. b-Sitosterol was given to severely ill COVID-19 patients in the hospital. But why? There is no indication that b-sitosterol is a recommended therapy in U.S. hospitals.
  3. Several teams uncovered associations between sitosterol and COVID-19 independent of the above logic, mostly by “brute force” queries. Most of the associations found in the Translator came from SemMedDB. A recurring finding was the Mendelian disease sitosterolemia (caused by mutations that increase sitosterol absorption). The associated pathology connected b-sitosterol to cholesterol and ApoE (among others). The former prompted an analysis of the metabolomics data of patients from the same cohort, which showed no correlation between plasma b-sitosterol and cholesterol levels. Cholesterol led to the finding that some ApoE alleles have been associated with risk for severe COVID-19.

  4. Pushing for any edge that could provide the desired connection, again relying mostly on SemMedDB/text-mining results, led to a molecular connection: sitosterol can interfere with the binding of the SARS-CoV-2 virus Spike protein to the cellular receptor ACE2.

  5. The above results (3) and (4) do not consider the piece of information that the sitosterol – COVID-19 connection was strictly observed only for severe cases. And as argued in (2ii), severe COVID-19 is not an indication for b-sitosterol supplement.

  6. Therefore, if b-sitosterol was not administered intentionally based on an explicit indication, we need to explore “UNINTENDED” administration as a byproduct of a treatment given only to patients with severe COVID-19. For this purpose, these two types of knowledge source / KP would have been helpful:

    i. A clinical / EHR KP that offers access to COVID19 patient records documenting severity and the drug treatments that they received.
    ii. A drug database that lists the inactive ingredients of drugs. Currently, the Translator does not systematically offer solid information on excipients (inactive ingredients) of drugs, although individual inactive ingredients are covered in compound databases.

Thus, the next steps were done manually:

  7. Instead of relying on a clinical data KP (6i), Brett Smith used his knowledge of the WHO COVID-19 ordinal scale to investigate the characteristics specific to severely ill patients. One characteristic of the severe grades of COVID-19 is breathing support, specifically mechanical ventilation in the most severe cases; it was therefore likely that all patients with increased b-sitosterol had been on breathing support. This was later confirmed by examination of the patient metadata.

  8. The next step required this INDEPENDENT PIECE OF KNOWLEDGE: Mechanical ventilation requires intubation, which is done under sedation. The drug commonly used for such fast induction of sedation is PROPOFOL. Thus, this drug is the most apparent common denominator of all the severe COVID-19 cases (proposed by Brett, after consultation with a clinician).

  9. This triggers the next question: How does propofol cause an increase of b-sitosterol? Logically, we have these two possibilities (almost a collectively exhaustive set):

    i. The propofol preparation (i.v. injection) contains plant extracts as an inactive ingredient, and with them b-sitosterol.
    ii. Propofol (or its metabolites) interferes with the resorption or metabolism of b-sitosterol in a way that leads to b-sitosterol accumulation (similar to the genetic disease sitosterolemia).

➡️➡️ A literature search supports (9i); however, we are still evaluating (9ii).

The scheme below (by Brett Smith) summarizes the hypothesis: a simple linear scheme that the Translator could, in principle, have provided. However, in its current state, logical reasoning by the user and examination of the original clinical patient data for more hints were required.

image

The nodes and edges in red indicate types currently not readily available in the Translator. Propofol (blue) was the critical hint needed – provided by human input.

colleenXu commented 2 years ago

@brettasmi @suihuang-ISB Wait...were the blood measurements made before interventions like propofol/parenteral-nutrition?

The original observation mentions "Each COVID19 data-point (symbol) is an average of two measurements at admission and approximately a week later." What does "admission" mean? What is the timing of the measurements compared to the course of symptoms or course of medical interventions?

suihuang-ISB commented 2 years ago

Hi Colleen - very good points! Thanks for pointing to these issues

(1) I summarized the two measurements for each patient as one point because the spread was in general small, and I did not want to give undue weight to the COVID patients compared to the healthy controls, where each individual had only one time point.

(2) With “Admission” I meant hospital admission: typically a first blood draw is taken close to the entry exam. BUT: I learned from Brett that “the first blood draw” may have referred to “at ENROLLMENT” in the study (consent takes days), which may be days after admission, and some patients may already be on the ventilator or sedated. Brett is trying to get back into the study patient metadata to confirm the exact time point of blood draws relative to intubation, if recorded at all by the study staff.

Sorry, these are the usual complications in such a hastily designed study for an emergency situation....

Hope that helps.

colleenXu commented 2 years ago

As I've noted during the meetings, Translator doesn't have much support / name-retrieval for Procedures. However, mechanical ventilation + parenteral nutrition are procedures used to manage symptoms of COVID19.

We do have Procedures from semmeddb + potentially from clinical KP (multiomics clinical risk kp api?)

CaseyTa commented 2 years ago

COHD-COVID can get us a little closer to linking severe COVID to Propofol, but there are a few caveats.

1) COHD-COVID is similar to the more commonly used COHD in that it supplies associations between pairs of contexts. The difference though is that when querying COHD-COVID, all returned associations were found within an implied context of hospitalized COVID-19 patients. At this stage, neither the input TRAPI query to COHD-COVID nor the output TRAPI response are modeling this COVID-19 context. Because of this, we're mostly leaving COHD-COVID out of Translator until the cohort/context modeling discussions are finalized, so the ARAs are not calling COHD-COVID, though the TRAPI endpoint is functional.

2) COHD-COVID does not stratify or have specific labels based on COVID severity. However, we can use a proxy for COVID severity by querying for associations with things like mechanical ventilation, intubation, acute respiratory distress syndrome (ARDS), or acute hypoxemic respiratory failure (AHRF). Unfortunately, we're not finding any observations of mechanical ventilation from COHD-COVID (very strange, I'm not sure why yet), but we are finding associations from intubation, ARDS, and AHRF with propofol.

Query with intubation (can use UMLS:C4039867 for AHRF or MONDO:0100130 for ARDS):

```
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": ["SNOMEDCT:52765003"],
                    "categories": ["biolink:Procedure"],
                    "name": "Intubation"
                },
                "n1": {
                    "categories": ["biolink:MolecularEntity"]
                }
            },
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:has_real_world_evidence_of_association_with"]
                }
            }
        }
    }
}
```

COHD-COVID TRAPI responses (can be viewed in ARAX UI via Import Response): Acute hypoxemic respiratory failure, Acute respiratory distress syndrome, Intubation
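To make the "can use ... for AHRF or ... for ARDS" substitution concrete, the sketch below derives the three proxy queries from the intubation query. The identifiers are the ones given above; the category chosen for AHRF and ARDS (`biolink:DiseaseOrPhenotypicFeature`) is my assumption, and `proxy_queries` is a hypothetical helper. Sending the queries to COHD-COVID is deliberately left out.

```python
import copy

# The intubation query from above.
BASE_QUERY = {
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {"ids": ["SNOMEDCT:52765003"],
                       "categories": ["biolink:Procedure"],
                       "name": "Intubation"},
                "n1": {"categories": ["biolink:MolecularEntity"]},
            },
            "edges": {
                "e0": {"subject": "n0", "object": "n1",
                       "predicates": ["biolink:has_real_world_evidence_of_association_with"]},
            },
        }
    }
}

# (identifier, category, name). The category for AHRF and ARDS is assumed, not from the thread.
SEVERITY_PROXIES = [
    ("SNOMEDCT:52765003", "biolink:Procedure", "Intubation"),
    ("UMLS:C4039867", "biolink:DiseaseOrPhenotypicFeature",
     "Acute hypoxemic respiratory failure"),
    ("MONDO:0100130", "biolink:DiseaseOrPhenotypicFeature",
     "Acute respiratory distress syndrome"),
]


def proxy_queries(base, proxies):
    """Return {proxy name: query}, each a copy of base with node n0 swapped."""
    queries = {}
    for curie, category, name in proxies:
        query = copy.deepcopy(base)
        n0 = query["message"]["query_graph"]["nodes"]["n0"]
        n0.update({"ids": [curie], "categories": [category], "name": name})
        queries[name] = query
    return queries


for name, query in proxy_queries(BASE_QUERY, SEVERITY_PROXIES).items():
    print(name, query["message"]["query_graph"]["nodes"]["n0"]["ids"])
```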

colleenXu commented 2 years ago

This query through BTE connects COVID to sitosterol / phytosterol through propofol + soybean oil / fat emulsion, as the TOP result. There are 32 results total. It takes 2-3 min to run through BTE.

https://arax.ncats.io/?r=8e8037ef-222b-4913-9c6a-3ebd7721b55d

I used the following method:

  1. Search terms are likely too specific. I used UMLS's search function and concept pages to find:
    a. more general terms for beta-sitosterol (C0106127). This led me to the sitosterols + phytosterol terms.
    b. more general terms for soybean oil (C0037732). This led me to some oil / lipid / fat related terms.
    c. more specific terms for COVID infection (C5203670), since the focus is on acute / severe presentation.
    d. Note that there is a "propofol lipid emulsion" term in UMLS, but I wasn't able to find it connected to anything in semmeddb...
  2. I built up from both ends: sitosterol/phytosterol terms -> oil, and covid19 -> propofol. Then I checked whether oil and propofol are connected.

The issue is that this set of links would be buried in a less constrained query like sitosterol/phytosterol -> NamedThing -> NamedThing <-(treated_by)- COVID19. And a less constrained query would take a long time to run.

The query

```
{
    "message": {
        "query_graph": {
            "edges": {
                "e00": { "subject": "n0", "object": "n1" },
                "e01": { "subject": "n1", "object": "n2" },
                "e02": {
                    "subject": "n3",
                    "object": "n2",
                    "predicates": ["biolink:treated_by"]
                }
            },
            "nodes": {
                "n0": {
                    "ids": [
                        "CHEMBL.COMPOUND:CHEMBL221542",
                        "UMLS:C0106127",
                        "UMLS:C0037215",
                        "UMLS:C2349081",
                        "UMLS:C0031866"
                    ],
                    "categories": ["biolink:ChemicalEntity"],
                    "is_set": true,
                    "name": "sitosterols and phytosterols"
                },
                "n1": {
                    "ids": [
                        "UMLS:C0037732",
                        "UMLS:C0042438",
                        "UMLS:C3883356",
                        "UMLS:C0304483"
                    ],
                    "categories": ["biolink:ChemicalEntity", "biolink:Food"],
                    "is_set": true,
                    "name": "oil, lipid, fat terms"
                },
                "n2": { "categories": ["biolink:NamedThing"] },
                "n3": {
                    "ids": [
                        "MONDO:0100096",
                        "UMLS:C5439524",
                        "UMLS:C5540804"
                    ],
                    "categories": ["biolink:Disease"],
                    "is_set": true,
                    "name": "covid and acute specifically"
                }
            }
        }
    }
}
```
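The "build up from both ends" method described above can be sketched on a toy edge list: take one hop from each end, then check whether the two frontiers are connected. The edges below are illustrative only. They mirror connections mentioned in this thread (phytosterols to soybean oil, soybean oil to propofol, COVID19 to propofol, plus COVID19 to APOE as a distractor) and are not query output; `bridge_pairs` is a hypothetical helper.

```python
from collections import defaultdict

# Toy, undirected edges for illustration only.
TOY_EDGES = [
    ("phytosterols", "soybean oil"),
    ("soybean oil", "propofol"),
    ("COVID19", "propofol"),
    ("COVID19", "APOE"),
]


def adjacency(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def bridge_pairs(edges, start, end):
    """Pairs (a, b): a is one hop from start, b is one hop from end, and a-b is an edge."""
    adj = adjacency(edges)
    return sorted(
        (a, b)
        for a in adj[start]
        for b in adj[end]
        if b in adj[a]
    )


print(bridge_pairs(TOY_EDGES, "phytosterols", "COVID19"))
```

Expanding one hop from each end keeps both frontiers small, which is the same reason the constrained query above runs faster than a fully open three-hop query.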
colleenXu commented 2 years ago

However, I'm finding it hard to connect COVID19 directly with terms related to mechanical ventilation or parenteral nutrition (through BTE)

However, I can connect COVID19 to mechanical ventilation by going through "COVID related diseases" like respiratory failure and ARDS (through BTE; 196 results total)

https://arax.ncats.io/?r=52b0043d-8093-4395-b292-56007af4a114

mechanical vent terms -> NamedThing <- COVID19 terms

```
{
    "message": {
        "query_graph": {
            "edges": {
                "e00": { "subject": "n0", "object": "n1" },
                "e01": { "subject": "n2", "object": "n1" }
            },
            "nodes": {
                "n0": {
                    "ids": ["UMLS:C0554804", "UMLS:C0199470"],
                    "categories": ["biolink:Procedure"],
                    "is_set": true,
                    "name": "vent"
                },
                "n1": { "categories": ["biolink:DiseaseOrPhenotypicFeature"] },
                "n2": {
                    "ids": [
                        "MONDO:0100096",
                        "UMLS:C5439524",
                        "UMLS:C5540804"
                    ],
                    "categories": ["biolink:Disease"],
                    "is_set": true,
                    "name": "covid and acute specifically"
                }
            }
        }
    }
}
```

EDIT

mechanical ventilation is a tricky concept because there is a "procedure" of administering/receiving it - with lots of different IDs in UMLS (invasive? positive-pressure?). There's also the related concept of intubation and the medical devices (ventilators) themselves...

I only picked two ventilation procedure IDs to use above.

colleenXu commented 2 years ago

Also, I can connect COVID19 to phytosterols by going through parenteral nutrition + "COVID related diseases" like respiratory failure / ARDS (through BTE). 33 results total in BTE

https://arax.ncats.io/?r=30c372fe-5a6f-4c06-a518-5b1ee5a8a914

the query

```
{
    "message": {
        "query_graph": {
            "edges": {
                "e00": { "subject": "n0", "object": "n1" },
                "e01": { "subject": "n1", "object": "n2" },
                "e02": { "subject": "n3", "object": "n2" }
            },
            "nodes": {
                "n0": {
                    "ids": [
                        "CHEMBL.COMPOUND:CHEMBL221542",
                        "UMLS:C0106127",
                        "UMLS:C0037215",
                        "UMLS:C2349081",
                        "UMLS:C0031866"
                    ],
                    "categories": ["biolink:ChemicalEntity"],
                    "is_set": true,
                    "name": "sitosterols and phytosterols"
                },
                "n1": {
                    "ids": [
                        "UMLS:C0015667",
                        "UMLS:C0717967",
                        "UMLS:C2936298",
                        "UMLS:C3883356"
                    ],
                    "categories": ["biolink:ChemicalEntity", "biolink:Food"],
                    "name": "Parenteral nutrition",
                    "is_set": true
                },
                "n2": { "categories": ["biolink:DiseaseOrPhenotypicFeature"] },
                "n3": {
                    "ids": [
                        "MONDO:0100096",
                        "UMLS:C5439524",
                        "UMLS:C5540804"
                    ],
                    "categories": ["biolink:Disease"],
                    "is_set": true,
                    "name": "covid and acute specifically"
                }
            }
        }
    }
}
```

Of course, the same issue as above applies: this set of links would be buried in a less constrained query like sitosterol/phytosterol -> NamedThing -> NamedThing <- COVID19. And a less constrained query would take a long time to run.

EDIT: Parenteral nutrition is a tricky concept because there is a "procedure" of administering/receiving it, and there is a "parenteral nutrition solution" food / chemical mix / drug that is given. I could only find the connection from phytosterols to the food/chem/drug mixture.....not to the procedure.

colleenXu commented 2 years ago

Note that lots of UMLS IDs don't have human-readable labels / ID cross-mappings. BTE uses SRI Node Normalizer for these tasks.

brettasmi commented 2 years ago

This week, I was able to track down a knowledge source that contains the information for one of our missing links in this QOTM: propofol's association with soybean oil. I found it in DailyMed, the NIH's drug label database that contains structured product labels for drugs. The structured product label contains an "Ingredients and Appearance" section that has information on active and inactive ingredients. An example propofol record can be found here, and the full set of results here. You can find the soybean oil by clicking on the "Ingredients and Appearance" section on one of the propofol emulsion pages.

Note that explaining the connection this way would change the above diagram a bit. Instead of a direct connection between propofol and soybean oil, we would see an intermediate node for a label / preparation / specific drug entry. Said node would have edges meaning "contains" to both propofol and soybean oil.
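The intermediate-node idea can be sketched as a tiny edge list in which a label node "contains" each ingredient. The propofol / soybean oil pairing comes from the DailyMed record described above; the label identifier is a placeholder rather than a real DailyMed ID, and `co_ingredients` is a hypothetical helper.

```python
# (label node, ingredient, role) triples; each triple stands for one "contains" edge.
# "label:propofol-emulsion" is a placeholder, not a real DailyMed identifier.
CONTAINS = [
    ("label:propofol-emulsion", "propofol", "active"),
    ("label:propofol-emulsion", "soybean oil", "inactive"),
]


def co_ingredients(contains, ingredient):
    """Ingredients that share at least one label node with the given ingredient."""
    labels = {label for label, item, _ in contains if item == ingredient}
    return sorted({
        item
        for label, item, _ in contains
        if label in labels and item != ingredient
    })


print(co_ingredients(CONTAINS, "propofol"))
```

In this model propofol and soybean oil are never linked directly; the connection always runs through a specific preparation, which matches how the structured product labels record it.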

With DailyMed available, I explored some of Casey's COHD-COVID results that he shared last week. First I filtered the drugs to those in the class of anaesthetics using SPOKE. Excluding Propofol, I was left with the following:

I used DailyMed to examine various preparations of the three drugs above. I didn't check exhaustively, but for the most part, they seemed to be prepared with only propylene glycol or saline. Of course, all of the other drugs that COHD found to be associated with ventilation-related conditions or procedures remain candidates for search as well.

Finally, we are still waiting on the results from the EHR to see if the specific patients who exhibited the elevated levels of b-sitosterol were administered propofol before their blood draw. Hopefully we'll have that information before long, and we'll be able to determine whether the above hypothesis has any support or whether we should go back to the drawing board to explain this observation.

A note on DailyMed: it appears to be downloadable and there are cross references available to RxNorm and potentially others. I didn't have time to do a deep dive on the structure of the downloads, but it seems like something that could be exposed as a KP with some work. I also discovered this section of an FDA page on drug labels, which has a number of other resources that could potentially contain this information.

colleenXu commented 2 years ago

Here's a belated reply to an email from @suihuang-ISB (I'll tag @brettasmi here too):

In response to this post, Sui wrote:

We are particularly interested what part was required some ‘prompting’ by what you have heard, and what part of the answer could have found, automatically when you had learned nothing from the discussions. Such demarcation is important.

For the post in question and the related posts (propofol, parenteral nutrition), I think the query structures were strongly influenced by the clues given during discussion. Perhaps these clues could have been gleaned from reading outside sources (I think an informed person would know that supportive treatment for severe, acute COVID involves sedation, ventilation, and some kind of nutrition support). But I did find it difficult to tell if "parenteral nutrition" was done....

However, without constraining the queries to the topics of "ventilation" or "propofol" or "parenteral nutrition", I think we hit a problem of inflated/under-constrained queries. I think a less constrained version of the queries would be:

colleenXu commented 2 years ago

And related issues: I'll reiterate that

karafecho commented 11 months ago

Closing with comment in #233 ...