NCATSTranslator / minihackathons

Workflow C: Clinical real-world evidence: current drugs and potential new insights #34

Closed. rtroper closed this issue 3 years ago.

rtroper commented 3 years ago

Overall Workflow

See this google drive folder (specifically, these google slides)

Note: Overlay edges can provide research evidence (links to published papers) for final results

image

December Demo Narrative

The following provides a high-level narrative for walking through workflow C in the December demo. Note that this should be seen as a template narrative in which the specific disease example (in this case, multiple sclerosis) could be replaced by another disease of interest. By the time of the December demo, we should have one or two solid disease examples in which the narrative outlined below highlights some compelling results.

Example narrative using multiple sclerosis as the disease of interest

For a disease of interest, we would first like to identify drugs that are in some way associated with the disease, based on data from a clinical care setting. These could be drugs (1) used to directly treat the disease, (2) used to treat symptoms or comorbidities typically associated with the disease, or possibly (3) used to treat another condition, but with the side effect of exacerbating the disease of interest.

As an example, let's say we're interested in drugs associated with multiple sclerosis. We could query Translator for this information as follows:

[Run query C.1 (or view results previously obtained) using multiple sclerosis as the disease of interest]

Reviewing results, we see several drugs commonly used in the treatment of multiple sclerosis:

[Point out examples of drugs used to treat the disease e.g. ocrelizumab, natalizumab]

We also see drugs that are used to treat symptoms or secondary conditions commonly seen in multiple sclerosis patients:

[Point out examples of drugs used to treat symptoms, such as pain or muscle spasticity e.g. baclofen, tizanidine, gabapentin, oxybutynin]

A clinician reviewing these results will easily recognize these expected drugs. Other drugs may be more surprising. In some cases, these may be drugs that are used off-label for the disease or have been in clinical trials for repurposing.

[Point out examples e.g. imatinib]

What if Translator could (1) provide information about underlying pathways of action for such drugs and (2) identify an expanded list of drugs that may operate via similar pathways? We can run such a query.

[Run query C.2 (or view results previously obtained) using imatinib, or another interesting drug]

Here is a set of genes (or pathways) that are associated with both multiple sclerosis and the drug(s) that we selected from the first query. And here is a group of drugs that are associated with those genes/pathways. Inspecting the drugs, we see some interesting results.

[Highlight some of the drugs in the final group that are interesting (might also highlight some of the genes/pathways)]

Translator also has an "overlay" feature that can provide additional "provenance" or source information, such as references to published studies reporting evidence of associations between these drugs and the disease of interest. For example, here's a "research evidence" edge connecting the drug <interesting drug from final results> to multiple sclerosis. If we click on that link, we get a list of PubMed IDs for papers that we can look at to understand the basis of these associations.

[Click on a few PMIDs and highlight some of the published research results]

Emphasize that we could follow a similar investigative/exploratory workflow with other diseases besides multiple sclerosis.

[Possibly briefly highlight interesting results already generated for another disease]

Example Using Multiple Sclerosis as Disease X

Query 1: SmallMolecule related_to MONDO:0005301

The TRAPI query, immediately below, returns drugs related to multiple sclerosis. Using the ARAX DSL, it is possible to target specific KPs (for example clinical KPs) for more domain-specific knowledge (see DSL Query A, below). This can be further refined by querying a single, specific KP and sorting results based on an edge property (see DSL Query B, below).

TRAPI 1.1 Query

Submitted to: https://ars-dev.transltr.io/ars/api/submit

ARS pk: 9434d84a-0977-41db-8fd4-94292b866b97

ARAX results: https://arax.ncats.io/?r=12169

{
    "message": {
        "query_graph": {
            "edges": {
                "e00": {
                    "subject": "n00",
                    "object": "n01",
                    "predicates": ["biolink:related_to"]
                }
            },
            "nodes": {
                "n00": {
                    "categories": ["biolink:SmallMolecule"]
                },
                "n01": {
                    "categories": ["biolink:Disease"],
                    "ids": ["MONDO:0005301"]
                }
            }
        }
    }
}
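Since the narrative treats multiple sclerosis as a placeholder for any disease of interest, the one-hop query above can be generated from a disease CURIE. The following is a minimal sketch, not part of the original workflow: the function names `build_one_hop_query` and `submit_to_ars` are hypothetical, and the submission helper assumes the ARS endpoint accepts a JSON POST and returns JSON. It needs network access and is not called here.

```python
import json
import urllib.request

# Endpoint taken from the issue text above.
ARS_SUBMIT_URL = "https://ars-dev.transltr.io/ars/api/submit"


def build_one_hop_query(disease_curie, drug_category="biolink:SmallMolecule"):
    """Return a TRAPI query asking for drugs related_to the given disease."""
    return {
        "message": {
            "query_graph": {
                "edges": {
                    "e00": {
                        "subject": "n00",
                        "object": "n01",
                        "predicates": ["biolink:related_to"],
                    }
                },
                "nodes": {
                    "n00": {"categories": [drug_category]},
                    "n01": {
                        "categories": ["biolink:Disease"],
                        "ids": [disease_curie],
                    },
                },
            }
        }
    }


def submit_to_ars(query, url=ARS_SUBMIT_URL, timeout=60):
    """POST the query as JSON and return the decoded body (assumed to be JSON)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Build (but do not submit) the multiple sclerosis query shown above.
query = build_one_hop_query("MONDO:0005301")
```

Swapping in another CURIE, for example `MONDO:0020066` for Ehlers-Danlos, yields the corresponding Query 1 for that disease.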

ARAX DSL Query A: The following query specifically targets COHD and Clinical Risk KPs. Results: https://arax.ncats.io/?r=12173.

add_qnode(ids=MONDO:0005301, key=n00)
add_qnode(categories=biolink:SmallMolecule, key=n01)
add_qedge(subject=n01, object=n00, key=e00, predicates=biolink:related_to)
expand(kp=COHD,edge_key=e00,COHD_method=paired_concept_freq)
expand(kp=ClinicalRiskKP, edge_key=e00)
overlay(action=compute_ngd, virtual_relation_label=N1, subject_qnode_key=n01, object_qnode_key=n00)
resultify()
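The `compute_ngd` overlay in the query above attaches a normalized Google distance (NGD) to each drug/disease pair. As a rough illustration of what that number means, here is the standard NGD definition computed from occurrence counts. This is a sketch under assumptions: the function name and arguments are hypothetical, and the thread does not say which corpus or exact formula ARAX uses, so this should not be read as ARAX's implementation.

```python
import math


def normalized_google_distance(count_x, count_y, count_xy, total_docs):
    """Standard NGD from document counts.

    count_x, count_y: documents mentioning each term
    count_xy: documents mentioning both terms
    total_docs: corpus size (assumed larger than either count)
    Returns math.inf when the terms never co-occur.
    """
    if count_x <= 0 or count_y <= 0 or count_xy <= 0:
        return math.inf
    log_x = math.log(count_x)
    log_y = math.log(count_y)
    log_xy = math.log(count_xy)
    return (max(log_x, log_y) - log_xy) / (math.log(total_docs) - min(log_x, log_y))
```

Smaller values indicate that the two terms co-occur more often than their individual frequencies would suggest. Returning infinity for pairs with no co-occurrence mirrors the `default_value=inf` setting used in later queries in this thread.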

ARAX DSL Query B: The following query specifically targets the Clinical Risk KP, sorting by the feature_coefficient edge property and limiting to only the top 30 results. Results: https://arax.ncats.io/?r=12174.

add_qnode(ids=MONDO:0005301, key=n00)
add_qnode(categories=biolink:SmallMolecule, key=n01)
add_qedge(subject=n01, object=n00, key=e00, predicates=biolink:related_to)
expand(kp=ClinicalRiskKP, edge_key=e00)
overlay(action=compute_ngd, virtual_relation_label=N1, subject_qnode_key=n01, object_qnode_key=n00)
resultify()
filter_results(action=sort_by_edge_attribute, edge_attribute=feature_coefficient, direction=descending, max_results=30, prune_kg=true)
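The same DSL Query B is reused for other diseases later in this thread, with only the disease CURIE changing. A small helper can generate the command list. This is a sketch: `build_dsl_query_b` is a hypothetical name, and the commands are copied verbatim from the query above with the CURIE and result limit as parameters.

```python
def build_dsl_query_b(disease_curie, max_results=30):
    """Return the DSL Query B commands for a given disease CURIE."""
    return [
        f"add_qnode(ids={disease_curie}, key=n00)",
        "add_qnode(categories=biolink:SmallMolecule, key=n01)",
        "add_qedge(subject=n01, object=n00, key=e00, predicates=biolink:related_to)",
        "expand(kp=ClinicalRiskKP, edge_key=e00)",
        "overlay(action=compute_ngd, virtual_relation_label=N1, "
        "subject_qnode_key=n01, object_qnode_key=n00)",
        "resultify()",
        "filter_results(action=sort_by_edge_attribute, "
        "edge_attribute=feature_coefficient, direction=descending, "
        f"max_results={max_results}, prune_kg=true)",
    ]


# Print the multiple sclerosis version, one command per line.
print("\n".join(build_dsl_query_b("MONDO:0005301")))
```

The printed text can be pasted into the ARAX DSL input as-is.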

Query 2: SmallMolecules related_to MS/imatinib Genes

The TRAPI query, immediately below, is meant to return drugs related to a set of genes that are related to both (1) multiple sclerosis and (2) a single SME-selected drug (from ARAX DSL Query B results, above), imatinib. Because operations and other advanced features are not yet supported by TRAPI, the ARS, and some ARAs, the query is under-constrained and does not yield results. However, by taking advantage of such features as supported by ARAX, results are obtainable using an ARAX DSL query (see DSL Query, below).

TRAPI 1.1 Query

Submitted to: https://ars-dev.transltr.io/ars/api/submit

ARS pk: ec03a9bb-af05-4bf3-a57d-f196c02c858b

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "categories": ["biolink:Gene"]
                },
                "n1": {
                    "ids": ["CHEBI:45783"],
                    "categories": ["biolink:SmallMolecule"]
                },
                "n2": {
                    "ids": ["MONDO:0005301"],
                    "categories": ["biolink:Disease"]
                },
                "n3": {
                    "categories": ["biolink:SmallMolecule"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:related_to"]
                },
                "e02": {
                    "subject": "n0",
                    "object": "n2",
                    "predicates": ["biolink:related_to"]
                },
                "e03": {
                    "subject": "n0",
                    "object": "n3",
                    "predicates": ["biolink:related_to"]
                }
            }
        }
    }
}

ARAX DSL Query: The following query takes advantage of operations and other advanced features currently supported by ARAX. Results: https://arax.ncats.io/?r=12175.

add_qnode(key=n0, categories=biolink:Gene, is_set=True)
add_qnode(key=n1, ids=CHEBI:45783, categories=biolink:SmallMolecule)
add_qedge(key=e01, subject=n0, object=n1, predicates=biolink:related_to)
add_qnode(key=n2, ids=MONDO:0005301, categories=biolink:Disease)
add_qedge(key=e02, subject=n0, object=n2, predicates=biolink:related_to)
add_qnode(key=n3, categories=biolink:SmallMolecule)
add_qedge(key=e03, subject=n0, object=n3, predicates=biolink:related_to)
expand()
overlay(action=compute_jaccard,start_node_key=n1,intermediate_node_key=n0,end_node_key=n3,virtual_relation_label=J1)
overlay(action=compute_jaccard,start_node_key=n2,intermediate_node_key=n0,end_node_key=n3,virtual_relation_label=J2)
overlay(action=compute_ngd,default_value=inf,virtual_relation_label=N1,subject_qnode_key=n3,object_qnode_key=n2)
overlay(action=compute_ngd,default_value=inf,virtual_relation_label=N2,subject_qnode_key=n3,object_qnode_key=n1)
resultify()
filter_results(action=limit_number_of_results, max_results=50, prune_kg=true)
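The two `compute_jaccard` overlays score how much a candidate drug (n3) shares its gene neighbors (n0) with imatinib (n1) or with multiple sclerosis (n2). The plain set-based Jaccard index behind that idea can be sketched as follows. The gene labels are made up for illustration, and this is the textbook definition rather than a copy of ARAX's overlay code.

```python
def jaccard_index(set_a, set_b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    a, b = set(set_a), set(set_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical gene sets for a selected drug and a candidate drug.
selected_drug_genes = {"GENE_A", "GENE_B", "GENE_C"}
candidate_drug_genes = {"GENE_B", "GENE_C", "GENE_D"}
print(jaccard_index(selected_drug_genes, candidate_drug_genes))  # 0.5
```

A value near 1 means the two gene sets are nearly identical, and a value near 0 means they barely overlap.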

Query 2 Other Results

The following is a list of drugs from Query 1. Links, below, are to Query 2 results using the respective drug. Also included for each is the number of genes (common to the drug and to multiple sclerosis) in the knowledge graph.

rtroper commented 3 years ago

Ehlers-Danlos (MONDO:0020066) Query 1

TRAPI 1.1 ARS pk: 00699bff-e963-4155-ad49-1bb0ae148c86 ARAX results: https://arax.ncats.io/?r=12470

DSL Query A https://arax.ncats.io/?r=12473

DSL Query B https://arax.ncats.io/?r=12474

Query 2 Results

The following is a list of drugs from Query 1. Links, below, are to Query 2 results using the respective drug. Also included for each is the number of genes (common to the drug and to Ehlers-Danlos) in the knowledge graph.

rtroper commented 3 years ago

Guillain-Barre (MONDO:0016218) Query 1

TRAPI 1.1 ARS pk: c2faf4d2-3578-4509-8009-cbce77feeb03 ARAX results: https://arax.ncats.io/?r=12475

DSL Query A https://arax.ncats.io/?r=12476

DSL Query B https://arax.ncats.io/?r=12477

Query 2 Results

The following is a list of drugs from Query 1. Links, below, are to Query 2 results using the respective drug. Also included for each is the number of genes (common to the drug and to Guillain-Barre Syndrome) in the knowledge graph.

rtroper commented 3 years ago

Meniere's Disease (MONDO:0007972) Query 1

TRAPI 1.1 ARS pk: d5ecbe7d-511b-4e8a-9153-952cae0e8c88 ARAX results: https://arax.ncats.io/?r=12478

DSL Query A https://arax.ncats.io/?r=12479

DSL Query B https://arax.ncats.io/?r=12480

Query 2 Results

The following is a list of drugs from Query 1. Links, below, are to Query 2 results using the respective drug. Also included for each is the number of genes (common to the drug and to Meniere's disease) in the knowledge graph.

rtroper commented 3 years ago

Alzheimer's Disease (MONDO:0004975) Query 1

TRAPI 1.1 ARS pk: a5b53f29-90ed-4af9-b28c-b6f1d5ee3179 ARAX results: https://arax.ncats.io/?r=12481

DSL Query A https://arax.ncats.io/?r=12482

DSL Query B https://arax.ncats.io/?r=12483

rtroper commented 3 years ago

Ehlers-Danlos (MONDO:0020066) Query 2

Results using gatifloxacin (CHEMBL.COMPOUND:CHEMBL31): https://arax.ncats.io/?r=14810

add_qnode(key=n0, categories=biolink:Gene, is_set=True)
add_qnode(key=n1, ids=CHEMBL.COMPOUND:CHEMBL31, categories=biolink:SmallMolecule)
add_qedge(key=e01, subject=n0, object=n1, predicates=biolink:related_to)
add_qnode(key=n2, ids=MONDO:0020066, categories=biolink:Disease)
add_qedge(key=e02, subject=n0, object=n2, predicates=biolink:related_to)
add_qnode(key=n3, categories=biolink:SmallMolecule)
add_qedge(key=e03, subject=n0, object=n3, predicates=biolink:related_to)
expand()
overlay(action=compute_ngd,default_value=inf,virtual_relation_label=N1,subject_qnode_key=n3,object_qnode_key=n2)
overlay(action=compute_ngd,default_value=inf,virtual_relation_label=N2,subject_qnode_key=n3,object_qnode_key=n1)
resultify()
filter_results(action=limit_number_of_results, max_results=50, prune_kg=true)

Most or all results have only a single gene (MICE) connecting Ehlers-Danlos and gatifloxacin, indicating little to no overlap between genes associated with Ehlers-Danlos and genes associated with gatifloxacin.

Here are results for a single-hop query to get genes related to gatifloxacin: https://arax.ncats.io/?r=14812

add_qnode(key=n0, categories=biolink:Gene, is_set=False)
add_qnode(key=n1, ids=CHEMBL.COMPOUND:CHEMBL31, categories=biolink:SmallMolecule)
add_qedge(key=e01, subject=n0, object=n1, predicates=biolink:related_to)
expand()
resultify()
filter_results(action=limit_number_of_results, max_results=50, prune_kg=true)

Here are results for a single-hop query to get genes related to Ehlers-Danlos: https://arax.ncats.io/?r=14813

add_qnode(key=n0, categories=biolink:Gene, is_set=False)
add_qnode(key=n2, ids=MONDO:0020066, categories=biolink:Disease)
add_qedge(key=e02, subject=n0, object=n2, predicates=biolink:related_to)
expand()
resultify()
filter_results(action=limit_number_of_results, max_results=50, prune_kg=true)
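The overlap between the two single-hop result sets above can be checked directly by intersecting their gene nodes. The sketch below assumes each response follows the TRAPI 1.1 layout, in which `message.knowledge_graph.nodes` is a dictionary keyed by CURIE and each node carries a `categories` list. The function names and the toy CURIEs are hypothetical.

```python
def gene_ids(trapi_response):
    """Collect CURIEs of knowledge-graph nodes categorized as biolink:Gene."""
    message = trapi_response.get("message") or {}
    knowledge_graph = message.get("knowledge_graph") or {}
    nodes = knowledge_graph.get("nodes") or {}
    return {
        curie
        for curie, node in nodes.items()
        if "biolink:Gene" in (node.get("categories") or [])
    }


def shared_genes(drug_response, disease_response):
    """Genes present in both single-hop responses."""
    return gene_ids(drug_response) & gene_ids(disease_response)


# Toy responses with made-up CURIEs, standing in for the two real result sets.
drug_response = {"message": {"knowledge_graph": {"nodes": {
    "EX:gene1": {"categories": ["biolink:Gene"]},
    "EX:gene2": {"categories": ["biolink:Gene"]},
    "EX:drug": {"categories": ["biolink:SmallMolecule"]},
}}}}
disease_response = {"message": {"knowledge_graph": {"nodes": {
    "EX:gene2": {"categories": ["biolink:Gene"]},
    "EX:gene3": {"categories": ["biolink:Gene"]},
    "EX:disease": {"categories": ["biolink:Disease"]},
}}}}
print(sorted(shared_genes(drug_response, disease_response)))  # ['EX:gene2']
```

Because both queries above cap results at 50 and prune the knowledge graph, an intersection computed this way reflects only the retained results, not every gene the KPs know about.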
rtroper commented 3 years ago

Cystic Fibrosis (MONDO:0009061) Query 1

TRAPI 1.1 ARS pk: 9bc74b8c-c1d1-417e-8845-54a6767b8d69 ARAX results: https://arax.ncats.io/?r=15274

DSL Query targeting clinical KPs https://arax.ncats.io/?r=15277

add_qnode(ids=MONDO:0009061, key=n00)
add_qnode(categories=biolink:SmallMolecule, key=n01)
add_qedge(subject=n01, object=n00, key=e00, predicates=biolink:related_to)
expand(kp=COHD,edge_key=e00,COHD_method=paired_concept_freq)
expand(kp=ClinicalRiskKP, edge_key=e00)
overlay(action=overlay_exposures_data,subject_qnode_key=n01,object_qnode_key=n00)
overlay(action=compute_ngd, virtual_relation_label=N1, subject_qnode_key=n01, object_qnode_key=n00)
resultify()
rtroper commented 3 years ago

Proposal for additional step in workflow C

The following schematic illustrates a proposed augmented workflow that includes a final step to find published research evidence of association between a selected drug of interest and the disease of interest. Letters A-D correspond to ongoing work. Details are given below.

image

rtroper commented 3 years ago

Using Workflow Runner for Queries 1 and 2

There is now a live workflow runner endpoint: https://translator-workflow-runner.renci.org/query. See the documentation here: https://translator-workflow-runner.renci.org/docs. Revised versions of Query 1 and Query 2 with workflow operations are shown below. As of 7/29, the workflow runner endpoint returns Internal Server Error, but both queries work when submitted to the ARAX endpoint: https://arax.ncats.io/api/arax/v1.1/query.

Query 1 (Explore)

Query 1 with workflow operations specified is shown below. The workflow runner is currently returning Internal Server Error, but the query works if submitted to the ARAX endpoint. Results are here: https://arax.ncats.io/api/arax/v1.1/response/17141.

{
    "workflow": [
        {
            "id": "fill"
        },
        {
            "id": "bind"
        },
        {
            "id": "overlay_compute_ngd",
            "parameters": {
                "qnode_keys": [
                    "n0",
                    "n1"
                ],
                "virtual_relation_label": "NGD1"
            }
        },
        {
            "id": "complete_results"
        },
        {
            "id": "filter_results_top_n",
            "parameters": {
                "max_results": 50
            }
        }
    ],
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "categories": [
                        "biolink:ChemicalSubstance"
                    ]
                },
                "n1": {
                    "ids": [
                        "MONDO:0005301"
                    ],
                    "categories": [
                        "biolink:Disease"
                    ]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": [
                        "biolink:related_to"
                    ]
                }
            }
        }
    }
}
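Given that the workflow runner was returning Internal Server Error while the ARAX endpoint accepted the same payload, a client can try the endpoints in order and fall back on failure. This is a minimal sketch: `post_json` and `submit_with_fallback` are hypothetical helpers, both URLs come from the text above, and it assumes each endpoint accepts a JSON POST and returns JSON.

```python
import json
import urllib.error
import urllib.request

# Tried in order: the workflow runner first, then the ARAX endpoint.
ENDPOINTS = [
    "https://translator-workflow-runner.renci.org/query",
    "https://arax.ncats.io/api/arax/v1.1/query",
]


def post_json(url, payload, timeout=120):
    """POST payload as JSON and return the decoded JSON body."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def submit_with_fallback(payload, endpoints=ENDPOINTS, post=post_json):
    """Return (url, response) from the first endpoint that succeeds."""
    errors = []
    for url in endpoints:
        try:
            return url, post(url, payload)
        except (urllib.error.URLError, ValueError) as exc:
            # HTTPError (e.g. a 500) is a URLError; bad JSON raises ValueError.
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all endpoints failed: " + "; ".join(errors))
```

The `post` argument exists so the fallback logic can be exercised without network access by passing a stand-in function.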

Query 2 (Explain & Expand)

Query 2 with workflow operations specified is shown below. The workflow runner is currently returning Internal Server Error, but the query works if submitted to the ARAX endpoint. Results are here: https://arax.ncats.io/api/arax/v1.1/response/17162.

{
    "workflow": [
        {
            "id": "fill"
        },
        {
            "id": "bind"
        },
        {
            "id": "overlay_compute_ngd",
            "parameters": {
                "qnode_keys": [
                    "n1",
                    "n3"
                ],
                "virtual_relation_label": "NGD1"
            }
        },
        {
            "id": "overlay_compute_ngd",
            "parameters": {
                "qnode_keys": [
                    "n2",
                    "n3"
                ],
                "virtual_relation_label": "NGD2"
            }
        },
        {
            "id": "complete_results"
        },
        {
            "id": "filter_results_top_n",
            "parameters": {
                "max_results": 50
            }
        }
    ],
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "categories": [
                        "biolink:Gene"
                    ],
                    "is_set": true
                },
                "n1": {
                    "ids": [
                        "CHEBI:45783"
                    ],
                    "categories": [
                        "biolink:ChemicalSubstance"
                    ]
                },
                "n2": {
                    "ids": [
                        "MONDO:0005301"
                    ],
                    "categories": [
                        "biolink:Disease"
                    ]
                },
                "n3": {
                    "categories": [
                        "biolink:ChemicalSubstance"
                    ]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": [
                        "biolink:related_to"
                    ]
                },
                "e02": {
                    "subject": "n0",
                    "object": "n2",
                    "predicates": [
                        "biolink:related_to"
                    ]
                },
                "e03": {
                    "subject": "n0",
                    "object": "n3",
                    "predicates": [
                        "biolink:related_to"
                    ]
                }
            }
        }
    }
}
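In these workflow queries, each `overlay_compute_ngd` operation names query-graph nodes through `qnode_keys`, so a mistyped key is easy to introduce when adapting the query. A quick pre-submission check can catch that. The sketch below is an illustrative helper with a hypothetical name, not part of any Translator tooling.

```python
def missing_qnode_keys(query):
    """Return sorted qnode keys referenced by workflow operations but absent from the query graph."""
    message = query.get("message") or {}
    query_graph = message.get("query_graph") or {}
    nodes = query_graph.get("nodes") or {}
    missing = set()
    for operation in query.get("workflow") or []:
        parameters = operation.get("parameters") or {}
        for key in parameters.get("qnode_keys") or []:
            if key not in nodes:
                missing.add(key)
    return sorted(missing)


# Abbreviated version of Query 2 above, with one deliberately wrong key ("n9").
example = {
    "workflow": [
        {"id": "fill"},
        {"id": "overlay_compute_ngd",
         "parameters": {"qnode_keys": ["n1", "n3"], "virtual_relation_label": "NGD1"}},
        {"id": "overlay_compute_ngd",
         "parameters": {"qnode_keys": ["n2", "n9"], "virtual_relation_label": "NGD2"}},
    ],
    "message": {"query_graph": {"nodes": {"n0": {}, "n1": {}, "n2": {}, "n3": {}}}},
}
print(missing_qnode_keys(example))  # ['n9']
```

An empty list means every referenced key exists in the query graph.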
rtroper commented 3 years ago

Psoriatic Arthritis (MONDO:0011849) Query 1

TRAPI 1.1 ARS results: https://arax.ncats.io/?source=ARS&id=f1d5abb8-ebe2-473f-ad77-26d0a2c4dfb6

DSL Query A https://arax.ncats.io/?r=17318

DSL Query B https://arax.ncats.io/?r=17319

Query 2 Results

The following is a list of drugs from Query 1. Links, below, are to Query 2 results using the respective drug. Also included for each is the number of genes (common to the drug and to psoriatic arthritis) in the knowledge graph. Note that normalized Google distance (NGD) was used, but not the Jaccard index.

jh111 commented 3 years ago

This has been replaced by the README. https://github.com/NCATSTranslator/minihackathons/tree/main/2021-12_demo/workflowC.