NCATSTranslator / minihackathons

MIT License

Workflow B discussion fka B.1_DILI-three-hop-from-disease-or-phenotypic-feature.trapi #43

Open karafecho opened 3 years ago

karafecho commented 3 years ago

B.1_DILI-three-hop-from-disease-or-phenotypic-feature.trapi

This query runs from CURIE to DiseaseOrPhenotypicFeature to Gene to ChemicalSubstance.

For reference, here is the workflow mural board, pre-relay meeting materials, and relay meeting materials.

Here is a related Workflow B mini-hackathon notebook.

Example 3-hop TRAPI 1.1 query, original workflow, drug-induced liver injury (MONDO:0005359) as input CURIE.

Additional DiseaseOrPhenotypicFeature names (identifiers): toxic liver disease with acute hepatitis (SNOMEDCT:197358007), chronic DILI (MESH:D056487), hospitalization (MESH:D006760), transplanted liver complication (NCIT:C26991).

Note that this query should return results from ARAX if condition_associated_with_gene is flipped to gene_associated_with_condition, which is the Biolink canonical direction.

 {
   "nodes": {
     "n0": {
       "name": "drug-induced liver injury",
       "ids": ["MONDO:0005359"]
     },
     "n1": {
       "categories": [
         "biolink:DiseaseOrPhenotypicFeature"
       ],
       "name": "Disease Or Phenotypic Feature"
     },
     "n2": {
       "categories": [
         "biolink:Gene"
       ],
       "name": "Gene"
     },
     "n3": {
       "categories": [
         "biolink:ChemicalSubstance"
       ],
       "name": "Chemical Substance"
     }
   },
   "edges": {
     "e0": {
       "subject": "n0",
       "object": "n1",
       "predicates": [
         "biolink:correlated_with"
       ]
     },
     "e1": {
       "subject": "n1",
       "object": "n2",
       "predicates": [
         "biolink:condition_associated_with_gene"
       ]
     },
     "e2": {
       "subject": "n2",
       "object": "n3",
       "predicates": [
         "biolink:related_to"
       ]
     }
   }
 }
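
The predicate flip mentioned in the note above the query can be sketched in Python. This is a minimal illustration, not ARAX's own logic: the `CANONICAL_INVERSE` table and the `canonicalize_edges` helper are hypothetical names, and the table lists only the one predicate pair named in this thread. Because the canonical predicate reads gene-to-condition, the sketch also swaps the edge's subject and object.

```python
import copy

# Hypothetical lookup table: only the predicate pair named in this thread.
CANONICAL_INVERSE = {
    "biolink:condition_associated_with_gene": "biolink:gene_associated_with_condition",
}


def canonicalize_edges(query_graph):
    """Return a copy of query_graph with non-canonical edges flipped."""
    flipped = copy.deepcopy(query_graph)
    for edge in flipped["edges"].values():
        predicates = edge.get("predicates", [])
        # Flip only when every predicate on the edge has a known inverse,
        # so edges with mixed predicates are left untouched.
        if predicates and all(p in CANONICAL_INVERSE for p in predicates):
            edge["predicates"] = [CANONICAL_INVERSE[p] for p in predicates]
            edge["subject"], edge["object"] = edge["object"], edge["subject"]
    return flipped


# Edge e1 of the query above, reduced to the two nodes it connects.
query_graph = {
    "nodes": {
        "n1": {"categories": ["biolink:DiseaseOrPhenotypicFeature"]},
        "n2": {"categories": ["biolink:Gene"]},
    },
    "edges": {
        "e1": {
            "subject": "n1",
            "object": "n2",
            "predicates": ["biolink:condition_associated_with_gene"],
        }
    },
}

canonical = canonicalize_edges(query_graph)
print(canonical["edges"]["e1"])
```

The input graph is copied rather than modified, so the original and the flipped query can both be submitted for comparison.
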
karafecho commented 3 years ago

MONDO:0005359->DiseaseOrPhenotypicFeature->Gene->ChemicalSubstance

ARAX DSL query and results (David K.): https://arax.ncats.io/beta/?r=12696

ARAX DSL query and results, with operations (David K.): https://arax.ncats.io/beta/?r=12700

karafecho commented 3 years ago

Python query (Hao X.): ICEES+ DILI API [MONDO:0005359 (DILI), SNOMEDCT:197358007 (toxic liver disease with acute hepatitis), MESH:D056487 (chronic DILI), MESH:D006760 (hospitalization), NCIT:C26991 (transplanted liver complication)]->DiseaseOrPhenotypicFeature->ARAX TRAPI 1.1->DiseaseOrPhenotypicFeature->Gene->ChemicalSubstance

Python results

results.xlsx
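
The hand-off in this chain (first-hop ICEES results feeding a two-hop ARAX query) might look roughly like the sketch below. It is not Hao's script: `bound_ids` and `build_second_stage_query` are hypothetical helpers, the response shape assumed is the TRAPI 1.1 `message.results[].node_bindings` layout, and the `EXAMPLE:` identifiers in the mock response are placeholders rather than real output.

```python
def bound_ids(trapi_response, qnode_key):
    """Collect the unique CURIEs bound to one query node across all results."""
    results = trapi_response.get("message", {}).get("results") or []
    seen = []
    for result in results:
        for binding in result.get("node_bindings", {}).get(qnode_key, []):
            if binding["id"] not in seen:
                seen.append(binding["id"])
    return seen


def build_second_stage_query(feature_ids):
    """Build the remaining two hops, pinning n1 to the first-hop outputs."""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    "n1": {
                        "ids": list(feature_ids),
                        "categories": ["biolink:DiseaseOrPhenotypicFeature"],
                    },
                    "n2": {"categories": ["biolink:Gene"]},
                    "n3": {"categories": ["biolink:ChemicalSubstance"]},
                },
                "edges": {
                    "e1": {
                        "subject": "n1",
                        "object": "n2",
                        "predicates": ["biolink:condition_associated_with_gene"],
                    },
                    "e2": {
                        "subject": "n2",
                        "object": "n3",
                        "predicates": ["biolink:related_to"],
                    },
                },
            }
        }
    }


# Mock first-hop response with placeholder identifiers (not real ICEES output).
mock_first_hop = {
    "message": {
        "results": [
            {"node_bindings": {"n0": [{"id": "MONDO:0005359"}], "n1": [{"id": "EXAMPLE:1"}]}},
            {"node_bindings": {"n0": [{"id": "MONDO:0005359"}], "n1": [{"id": "EXAMPLE:2"}]}},
            {"node_bindings": {"n0": [{"id": "MONDO:0005359"}], "n1": [{"id": "EXAMPLE:1"}]}},
        ]
    }
}

feature_ids = bound_ids(mock_first_hop, "n1")
second_stage = build_second_stage_query(feature_ids)
```

The second-stage message keeps the predicates from the original three-hop query; swapping in the canonical direction for e1 is a separate decision.
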

karafecho commented 3 years ago

ARS Queries and PKs (Chris B.):

MONDO:0005359 (DILI)->DiseaseOrPhenotypicFeature->Gene->ChemicalSubstance

PK=28390bb4-3831-4cb7-b7bf-df08a80f2585

PK=0445a10f-05b6-4bf3-82a5-586a55655e1c

karafecho commented 3 years ago

Pre-relay output from step 3 via manual hand-offs: Molecular Provider TRAPI 1.0 results (Vlado D.): https://github.com/ranking-agent/robogallery/blob/master/relay_spring_2021/DILI/Enrichment-DILI_step3-molpro.ipynb

karafecho commented 3 years ago

Mini-hackathon recap and summary, 06.24.21 (also see related tickets)

This workflow has two major goals: (1) suggest drugs for repurposing, along with evidence of biological plausibility -> DILIN clinical trial (steroids? prednisone?); and (2) suggest chemicals for liver-on-a-chip screening (or other in vitro screening assays), along with evidence of biological plausibility -> NCATS, NIEHS. To promote our success, we are planning a two-pronged approach for execution of the workflow, one that supports multiple entry points and leverages knowledge from multiple providers, as depicted in the mural board.

Brief Summary: Direct three-hop ARAX DSL and Python queries are succeeding and returning reasonable results. Three-hop ARS queries have been less successful, primarily (I think) because the ARAs are not calling ICEES and are submitting calls to COHD that likely do not reach the correct endpoint. ARS one-hop queries have been more successful.

Direct ARAX DSL three-hop queries

three-hop MONDO:0005359 (DILI): https://arax.ncats.io/beta/?r=12696

three-hop MONDO:0005359, with operations: https://arax.ncats.io/beta/?r=12700

Direct Python three-hop queries of ICEES DILI + ARAX

three-hop MESH:D006760 (hospitalization) Python results, results.xlsx

Also works when using MONDO:0005359 (DILI), MESH:D056487 (chronic DILI), and NCIT:C26991 (transplanted liver complication)

Direct ICEES DILI API one-hop queries

one-hop MONDO:0005359:

curl -XPOST https://icees.renci.org:16341/query -H "Content-Type: application/json" -d '{"message": {"query_graph": {"nodes": {"n0": {"name": "drug-induced liver injury", "ids": ["MONDO:0005359"]}, "n1": {"categories": ["biolink:DiseaseOrPhenotypicFeature"], "name": "Disease Or Phenotypic Feature"}}, "edges": {"e0": {"subject": "n0", "object": "n1", "predicates": ["biolink:correlated_with"]}}}}}'

MONDO0005359 first-hop output.txt

Also works when using MESH:D006760 (hospitalization), MESH:D056487 (chronic DILI), and NCIT:C26991 (transplanted liver complication)
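
For anyone who prefers Python over curl, the one-hop request above can be built and posted with the standard library alone. This is a sketch under assumptions: `build_one_hop_message` and `post_query` are hypothetical helper names, the endpoint URL is simply the one from the curl command and may no longer be live, and the CURIE list is limited to the identifiers this recap reports as working.

```python
import json
import urllib.request

# Endpoint copied from the curl command above; availability is not guaranteed.
ICEES_URL = "https://icees.renci.org:16341/query"


def build_one_hop_message(curie, name=None):
    """Build the one-hop TRAPI message used in the curl command."""
    n0 = {"ids": [curie]}
    if name:
        n0["name"] = name
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    "n0": n0,
                    "n1": {
                        "categories": ["biolink:DiseaseOrPhenotypicFeature"],
                        "name": "Disease Or Phenotypic Feature",
                    },
                },
                "edges": {
                    "e0": {
                        "subject": "n0",
                        "object": "n1",
                        "predicates": ["biolink:correlated_with"],
                    }
                },
            }
        }
    }


def post_query(message, url=ICEES_URL, timeout=60):
    """POST a TRAPI message as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Input CURIEs and names as listed in this thread.
INPUT_CURIES = {
    "MONDO:0005359": "drug-induced liver injury",
    "MESH:D006760": "hospitalization",
    "MESH:D056487": "chronic DILI",
    "NCIT:C26991": "transplanted liver complication",
}

# One message per input CURIE; nothing is sent until post_query is called.
messages = {
    curie: build_one_hop_message(curie, name)
    for curie, name in INPUT_CURIES.items()
}
```

Calling `post_query(messages["MONDO:0005359"])` would send the same body as the curl command; building the messages separately makes it easy to inspect them before any network call.
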

ARS queries from June 3

three-hop MONDO:0005359 https://arax.ncats.io/?source=ARS&id=0445a10f-05b6-4bf3-82a5-586a55655e1c - errors and no results

one-hop MONDO:0005359 https://arax.ncats.io/?source=ARS&id=e5b5ecaf-1a56-44c3-af22-b3325d8f97ac - weird results

one-hop SNOMED:197358007 (toxic liver disease with acute hepatitis) https://arax.ncats.io/?source=ARS&id=12cf506e-bb4b-425e-851e-7e91bc90e489

one-hop MONDO:0019542 https://arax.ncats.io/?source=ARS&id=973fbf59-851d-4fd1-947c-bbdd743b2306

one-hop NCBIGene:60412 https://arax.ncats.io/?source=ARS&id=c53b5876-93cb-4fb0-bd88-5011a7b44ca5

ARS queries from June 24

three-hop queries are returning errors and no results with MONDO:0005359, even with the canonical biolink:gene_associated_with_condition, likely because the ARAs are not calling the ICEES+ DILI API:

https://arax.ncats.io/?source=ARS&id=8432eb9d-783c-4a0e-9343-298f67cac6bb

https://arax.ncats.io/?source=ARS&id=0445a10f-05b6-4bf3-82a5-586a55655e1c

https://arax.ncats.io/?source=ARS&id=60085c3e-3ad4-411e-9c7f-a279c8704e5d

one-hop from MONDO:0005359 https://arax.ncats.io/?source=ARS&id=0cdc5dab-49ba-4a3d-9688-8f36f95b964d - 9 results from ARAX, but seem off

two-hop from MONDO:0019542 - errors and 0 answers https://arax.ncats.io/?source=ARS&id=9f462d69-ddd0-4dbc-a4ac-5c6cb0ce0b03 https://arax.ncats.io/?source=ARS&id=973fbf59-851d-4fd1-947c-bbdd743b2306

one-hop from NCBIGene:60412 https://arax.ncats.io/?source=ARS&id=e5b5ecaf-1a56-44c3-af22-b3325d8f97ac - 2 results from ARAX, seem reasonable

CaseyTa commented 3 years ago

@vgardner-renci Is this discussion for Workflow B being moved somewhere else? This "issue" was one of those GitHub issues that isn't really an issue, but is actually used for capturing our discussion.

vgardner-renci commented 3 years ago

Apologies. I have reopened the Workflow B issues.

karafecho commented 3 years ago

@vgardner-renci : I think my chat message wasn't clear. I think we can close issues B.2-B.6, as these are just TRAPI queries and are now being captured in /2021-12_demo, but B.1 contains notes and discussion points, as Casey pointed out.