RTXteam / RTX

Software repo for Team Expander Agent (Oregon State U., Institute for Systems Biology, and Penn State U.)
https://arax.ncats.io/
MIT License

Add ability to call out to KG2 in Expander #641

Closed dkoslicki closed 4 years ago

dkoslicki commented 4 years ago

Maybe an option such as:

expand(edge_id=e00, KP=ARAX/KG1)

or

expand(edge_id=e00, KP=ARAX/KG2)

with KP being optional and defaulting to KG1. Fingers crossed that QueryGraphReasoner() "just works" on KG2.
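
To make the proposal concrete, here is a minimal sketch of how an optional KP parameter with a KG1 default might be resolved. The names `ALLOWED_KPS`, `DEFAULT_KP`, and `resolve_kp` are hypothetical and not taken from the RTX code base; only the two KP values come from the examples above.

```python
# Hypothetical sketch: these names are illustrative, not part of the RTX code.
ALLOWED_KPS = {"ARAX/KG1", "ARAX/KG2"}
DEFAULT_KP = "ARAX/KG1"


def resolve_kp(parameters: dict) -> str:
    """Return the knowledge provider to query, falling back to KG1 if none is given."""
    kp = parameters.get("KP", DEFAULT_KP)
    if kp not in ALLOWED_KPS:
        # Fail early rather than silently querying the wrong graph.
        raise ValueError(f"Unknown KP {kp!r}; expected one of {sorted(ALLOWED_KPS)}")
    return kp
```

With this, `resolve_kp({"edge_id": "e00"})` gives `"ARAX/KG1"`, and passing `"KP": "ARAX/KG2"` selects KG2.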

dkoslicki commented 4 years ago

@amykglen and just FYI, the brief_description will need to be updated here in ARAX_messenger to clarify that we can also call out to KG2.

amykglen commented 4 years ago

yep, thanks - will do that once it's ready to use!

amykglen commented 4 years ago

Ok, so this is working now (in the expander branch) - the command looks like this:

expand(edge_id=e00, kp=ARAX/KG2)

(Though you can of course leave the kp parameter out and it will default to KG1.) I also updated the auto-documentation in light of this (let me know if that doesn't look right).
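
For anyone curious how a command of this shape breaks down into an action name plus a parameter dictionary, here is a rough sketch. It is not the actual ARAX DSL parser: `parse_action` and its regular expression are assumptions made for illustration, and the naive comma split would not cope with values that themselves contain commas.

```python
import re

# Hypothetical parser for illustration; the real ARAX DSL parser may differ.
# Matches "name" or "name(key=value, key=value)".
_ACTION_RE = re.compile(r"^\s*(\w+)\s*(?:\((.*)\))?\s*$")


def parse_action(action: str):
    """Split a DSL action string into (name, parameters).

    Parameters are returned as a dict of strings, or None when the action
    is written without parentheses.
    """
    match = _ACTION_RE.match(action)
    if match is None:
        raise ValueError(f"Cannot parse action: {action!r}")
    name, arg_string = match.group(1), match.group(2)
    if arg_string is None:
        return name, None

    parameters = {}
    for pair in arg_string.split(","):
        if not pair.strip():
            continue  # tolerate empty argument lists such as "expand()"
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"Expected key=value, got {pair.strip()!r}")
        parameters[key.strip()] = value.strip()
    return name, parameters
```

Under these assumptions, `parse_action("expand(edge_id=e00, kp=ARAX/KG2)")` returns `("expand", {"edge_id": "e00", "kp": "ARAX/KG2"})`, and `parse_action("create_message")` returns `("create_message", None)`.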

I need to do some more rigorous testing, but it seems to be working ok thus far. :)

I ended up not using QueryGraphReasoner for it because differences in attribute names between KG1 and KG2 were causing some problems (e.g., node uri is called iri in KG2). Rather than update QGR to handle that, I thought I may as well start with a more tailored, Expander-specific version. It currently still uses some of the same pieces that QGR does (the same cypher generation code, for example); some of these could likely be streamlined further for Expander down the line, but they were a convenient starting point.
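
As an illustration of the attribute-name mismatch, a small renaming step along these lines could bring KG2 node properties in line with the KG1-style names. This is only a sketch: `KG2_TO_KG1_NODE_KEYS` and `normalize_kg2_node` are hypothetical names, and the iri/uri pair is the only difference mentioned here, so any other entries would need to be confirmed against the actual graphs.

```python
# Hypothetical mapping: only the iri -> uri rename is mentioned in this thread.
KG2_TO_KG1_NODE_KEYS = {"iri": "uri"}


def normalize_kg2_node(kg2_node: dict) -> dict:
    """Return a copy of a KG2 node dict that uses KG1-style property names.

    Keys without an entry in the mapping are passed through unchanged.
    """
    return {
        KG2_TO_KG1_NODE_KEYS.get(key, key): value
        for key, value in kg2_node.items()
    }
```

For example, a node dict carrying an `iri` key comes back with the same value under `uri`, while `id` and `name` are left untouched and the input dict is not modified.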

dkoslicki commented 4 years ago

@amykglen can you merge demo into expander and then do the following:

cd /RTX/data/KGmetadata
python dumpdata.py  # and wait about 20 min
cd RTX/code/reasoningtool/kg-construction
python KGNodeIndex.py -b  # and wait about 40 min
cd RTX/code/ARAX/ARAXQuery
python ARAX_query.py 20  # and feel free to edit this example for further testing or if I'm not calling your expander code correctly

and see if the result returned is non-empty? This will test if our #652 is working as intended with your code.

dkoslicki commented 4 years ago

@amykglen

> I also updated the auto-documentation in light of this (let me know if that doesn't look right).

Also, feel free to re-run document_dsl_commands.py in the expander branch and see if the resulting DSL_documentation.md is to your liking with this additional parameter.

amykglen commented 4 years ago

@dkoslicki - tested as you advised and it does seem to be working (non-empty results) - here's the output of ARAX_query.py 20:

Response:
  status: OK
  n_errors: 0  n_warnings: 0  n_messages: 30
  - 2020-02-29 09:23:09.190411 INFO: ARAXQuery launching
  - 2020-02-29 09:23:09.190440 INFO: Examine input query for needed information for dispatch
  - 2020-02-29 09:23:09.190449 INFO: Found input processing plan. Sending to the ProcessingPlanExecutor
  - 2020-02-29 09:23:09.190456 DEBUG: Entering executeProcessingPlan
  - 2020-02-29 09:23:09.353376 DEBUG: No starting messages were referenced. Will start with a blank template Message
  - 2020-02-29 09:23:09.354183 DEBUG: Found processing_actions
  - 2020-02-29 09:23:09.354205 INFO: Parsing input actions list
  - 2020-02-29 09:23:09.354214 DEBUG: Parsing action: create_message
  - 2020-02-29 09:23:09.354801 DEBUG: Parsing action: add_qnode(name=CUI:C1452002, id=n00)
  - 2020-02-29 09:23:09.355647 DEBUG: Parsing action: add_qnode(type=chemical_substance, is_set=true, id=n01)
  - 2020-02-29 09:23:09.355709 DEBUG: Parsing action: add_qedge(source_id=n00, target_id=n01, id=e00, type=interacts_with)
  - 2020-02-29 09:23:09.355746 DEBUG: Parsing action: expand(edge_id=e00, kp=ARAX/KG2)
  - 2020-02-29 09:23:09.355770 DEBUG: Parsing action: return(message=true, store=false)
  - 2020-02-29 09:23:09.360432 DEBUG: Considering action 'create_message' with parameters None
  - 2020-02-29 09:23:09.360468 INFO: Creating an empty template ARAX Message
  - 2020-02-29 09:23:09.361036 DEBUG: Considering action 'add_qnode' with parameters {'name': 'CUI:C1452002', 'id': 'n00'}
  - 2020-02-29 09:23:09.361066 INFO: Adding a QueryNode to Message with parameters {'id': 'n00', 'curie': None, 'name': 'CUI:C1452002', 'type': None, 'is_set': None}
  - 2020-02-29 09:23:09.361588 DEBUG: Looking up CURIE CUI:C1452002 in KgNodeIndex
  - 2020-02-29 09:23:09.378208 DEBUG: Considering action 'add_qnode' with parameters {'type': 'chemical_substance', 'is_set': 'true', 'id': 'n01'}
  - 2020-02-29 09:23:09.378278 INFO: Adding a QueryNode to Message with parameters {'id': 'n01', 'curie': None, 'name': None, 'type': 'chemical_substance', 'is_set': 'true'}
  - 2020-02-29 09:23:09.379082 DEBUG: Considering action 'add_qedge' with parameters {'source_id': 'n00', 'target_id': 'n01', 'id': 'e00', 'type': 'interacts_with'}
  - 2020-02-29 09:23:09.379163 INFO: Adding a QueryEdge to Message with parameters {'id': 'e00', 'source_id': 'n00', 'target_id': 'n01', 'type': 'interacts_with'}
  - 2020-02-29 09:23:09.379255 DEBUG: Considering action 'expand' with parameters {'edge_id': 'e00', 'kp': 'ARAX/KG2'}
  - 2020-02-29 09:23:09.379280 DEBUG: Applying Expand to Message with parameters {'edge_id': 'e00', 'kp': 'ARAX/KG2'}
  - 2020-02-29 09:23:09.388861 INFO: Sending this query graph to KG2Querier: {'nodes': [{'id': 'n00', 'curie': 'CUI:C1452002', 'type': 'chemical_substance', 'is_set': None}, {'id': 'n01', 'curie': None, 'type': 'chemical_substance', 'is_set': True}], 'edges': [{'id': 'e00', 'type': 'interacts_with', 'relation': None, 'source_id': 'n00', 'target_id': 'n01', 'negated': None}]}
  - 2020-02-29 09:23:09.388894 DEBUG: Generating cypher based on query graph sent to KG2Querier
  - 2020-02-29 09:23:09.389113 DEBUG: Sending cypher query to KG2 neo4j
  - 2020-02-29 09:23:14.045340 INFO: Query returned 2 nodes and 1 edges
  - 2020-02-29 09:23:14.046509 INFO: After expansion, Message.KnowledgeGraph has 2 nodes and 1 edges
  - 2020-02-29 09:23:14.046542 DEBUG: Considering action 'return' with parameters {'message': 'true', 'store': 'false'}

Number of results: 0
For example 15 (demo eg. 3), number of TP proteins: 0
Number of KnowledgeProviders in KG: Counter({"['https://skr3.nlm.nih.gov/SemMedDB']": 1})

and this is what the resulting message.knowledge_graph looks like:

{
   'edges':[
      {
         'confidence':None,
         'defined_datetime':None,
         'edge_attributes':None,
         'evidence_type':None,
         'id':'20451609',
         'is_defined_by':'ARAX/KG2',
         'negated':False,
         'provided_by':"['https://skr3.nlm.nih.gov/SemMedDB']",
         'publications':"['PMID:19211577']",
         'qedge_id':'e00',
         'qualifiers':None,
         'relation':'https://skr3.nlm.nih.gov/SemMedDB#interactsWith',
         'source_id':'CUI:C1452002',
         'target_id':'CUI:C0178735',
         'type':'interacts_with',
         'weight':None
      }
   ],
   'nodes':[
      {
         'description':None,
         'id':'CUI:C1452002',
         'name':'Iclaprim',
         'node_attributes':None,
         'qnode_id':'n00',
         'symbol':None,
         'type':[
            'chemical_substance'
         ],
         'uri':'https://identifiers.org/umls/cui/C1452002'
      },
      {
         'description':None,
         'id':'CUI:C0178735',
         'name':'macromolecule',
         'node_attributes':None,
         'qnode_id':'n01',
         'symbol':None,
         'type':[
            'chemical_substance'
         ],
         'uri':'https://identifiers.org/umls/cui/C0178735'
      }
   ]
}

dkoslicki commented 4 years ago

Nicely done! Glad to see that come together so quickly!

amykglen commented 4 years ago

Alright, this is done and merged into demo - closing.