cannin / enhance_nlp_interaction_network_gsoc2020

Add Count of INDRA Statements for Individual Terms #10

Open cannin opened 4 years ago

cannin commented 4 years ago

Add another column, INDRA_QUERY_TERM_STATEMENT_COUNT. Use the following example code:

import requests
from urllib.parse import urljoin

from indra.sources import indra_db_rest

grounding_service_url = ''

txt = 'BRAF'
txt = 'topotecan'

resp = requests.post(urljoin(grounding_service_url, 'ground'), json={'text': txt})
grounding_results = resp.json()

# TODO: Test if grounding_results has entries
term_id = grounding_results[0]['term']['id']
term_db = grounding_results[0]['term']['db']
term = term_id + '@' + term_db

# Get statements for query term 
out = indra_db_rest.get_statements(agents=[term])
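
The TODO above can be handled with a small guard before indexing into the results. A minimal sketch, assuming the grounding response is a list of matches shaped like the one used above (each with a 'term' dict holding 'id' and 'db'). The helper name build_agent_term is hypothetical, not part of INDRA or the grounding service, and the sample data is made up:

```python
def build_agent_term(grounding_results):
    """Return 'id@db' for the top grounding match, or None if there is none.

    Hypothetical helper; assumes each match looks like
    {'term': {'id': ..., 'db': ...}} as in the example code.
    """
    if not grounding_results:
        return None
    top_term = grounding_results[0].get('term', {})
    term_id = top_term.get('id')
    term_db = top_term.get('db')
    if not term_id or not term_db:
        return None
    return term_id + '@' + term_db


# Made-up sample data for illustration only
sample = [{'term': {'id': '1097', 'db': 'HGNC'}}]
print(build_agent_term(sample))   # a usable agent string
print(build_agent_term([]))       # no grounding found
```

When the helper returns None, the count column could be left empty or set to 0 instead of querying with a malformed term.
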
cannin commented 4 years ago

The harder challenge: return only statements from specific source_apis (e.g., reach), like this one:

        "evidence": [
                "source_api": "reach",
                "pmid": "28972042",
                "text": "TMCO1 dysregulates cell cycle progression via suppression of the AKT pathway, and S60 of the TMCO1 protein is crucial for its tumor suppressor roles.",
                "annotations": {
                    "found_by": "Negative_activation_syntax_1_verb",
                    "agents": {
                        "raw_text": [
                            "cell cycle"

I converted the statements to JSON with stmts_to_json ('from indra.statements.statements import stmts_to_json'). We might try to submit a PR related to this.
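
Without any extra dependency, the filtering could be done directly on the statement JSON. A rough sketch, assuming each statement is a dict with an "evidence" list whose entries carry a "source_api" key, as in the fragment above. The function name filter_by_source_api is hypothetical and the sample statements are made up:

```python
def filter_by_source_api(stmts_json, source_apis):
    """Keep statement dicts that have at least one evidence from source_apis."""
    wanted = set(source_apis)
    kept = []
    for stmt in stmts_json:
        # Collect the sources that contributed evidence to this statement
        sources = {ev.get('source_api') for ev in stmt.get('evidence', [])}
        if sources & wanted:
            kept.append(stmt)
    return kept


# Made-up statements for illustration; real ones would come from stmts_to_json
stmts_json = [
    {'type': 'Inhibition', 'evidence': [{'source_api': 'reach', 'pmid': '28972042'}]},
    {'type': 'Activation', 'evidence': [{'source_api': 'sparser'}]},
    {'type': 'Phosphorylation', 'evidence': []},
]
reach_stmts = filter_by_source_api(stmts_json, ['reach'])
print(len(reach_stmts))  # number of statements with reach evidence
```

The length of the filtered list could then feed a per-source statement count.
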

cannin commented 4 years ago

You might want to message the INDRA team to see if they already have a function that filters statements based on some properties; it should be a fairly independent function.

I have tackled similar challenges with jsonpath; I'm not sure if it will work here. You might want to mention this as well, though INDRA might not want the extra dependency. Example code:

import json
from jsonpath_ng import jsonpath
from jsonpath_ng.ext import parse

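def get_jsonpath(json_file, json_str, jsonpath_expr_str):
    # Load from the string when no file is given, otherwise from the file
    if json_file is None:
        dat = json.loads(json_str)
    else:
        with open(json_file) as f:
            dat = json.load(f)

    jsonpath_expr = parse(jsonpath_expr_str)

    results = jsonpath_expr.find(dat)

    results_list = []

    # Collect the matched values
    for match in results:
        results_list.append(match.value)

    return results_list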


if __name__ == "__main__":

    # json_file = 'covid19_model_2020-03-22-03-16-47.json'
    # jsonpath_expr_str = "$..text_refs"
    # jsonpath_expr_str = "$..stmts[?(@.belief == 1)]"
    # jsonpath_expr_str = "$..stmts[?(@.stmt.type == 'IncreaseAmount')]"
    # jsonpath_expr_str = "$..stmts[?(@.stmt.obj.db_refs.UP == 'P16278')]"
    # jsonpath_expr_str = "$..stmts[?(@.stmt.evidence[*].text_refs.PMCID == 'PMC331007')]"

    json_file = None
    json_str = '[{"id": "a", "foo": [{"baz": 1}, {"baz": 2}]}, {"id": "b", "foo": [{"baz": 3}, {"baz": 4}]}]'
    jsonpath_expr_str = '$[*].baz'
    jsonpath_expr_str = '$[?(@.id == "a")].foo'

    get_jsonpath(json_file, json_str, jsonpath_expr_str)
cannin commented 4 years ago

This JSONPath expression retrieves what I'd like:

jsonpath_expr_str = "$[?(@.evidence[*].source_api == 'reach')]"
PritiShaw commented 4 years ago

Hi Mentor, I have received a reply from Ben regarding our query. He told us about a method that takes source_apis, policy='one', **kwargs.

This is also implemented in the INDRA REST API, documented under the "Preassembly" heading.
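
To illustrate what a policy argument could mean for a source_apis filter, here is a plain-Python sketch that works on statement JSON. It is not INDRA's implementation. The function name filter_by_policy is hypothetical, the sample data is made up, and the 'all' and 'none' policy values and their meanings are assumptions (only policy='one' appears in this thread):

```python
def filter_by_policy(stmts_json, source_apis, policy='one'):
    """Filter statement dicts by evidence source.

    Assumed policy meanings:
      'one'  - keep if any of source_apis contributed evidence
      'all'  - keep if every one of source_apis contributed evidence
      'none' - keep if none of source_apis contributed evidence
    """
    wanted = set(source_apis)
    kept = []
    for stmt in stmts_json:
        sources = {ev.get('source_api') for ev in stmt.get('evidence', [])}
        found = sources & wanted
        if policy == 'one':
            keep = bool(found)
        elif policy == 'all':
            keep = found == wanted
        elif policy == 'none':
            keep = not found
        else:
            raise ValueError('Unknown policy: %s' % policy)
        if keep:
            kept.append(stmt)
    return kept


# Made-up statements for illustration only
example_stmts = [
    {'id': 's1', 'evidence': [{'source_api': 'reach'}]},
    {'id': 's2', 'evidence': [{'source_api': 'reach'}, {'source_api': 'sparser'}]},
    {'id': 's3', 'evidence': [{'source_api': 'biopax'}]},
]
for pol in ('one', 'all', 'none'):
    ids = [s['id'] for s in filter_by_policy(example_stmts, ['reach', 'sparser'], pol)]
    print(pol, ids)
```
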