RTXteam / RTX

Software repo for Team Expander Agent (Oregon State U., Institute for Systems Biology, and Penn State U.)
https://arax.ncats.io/
MIT License

why is the relationship ibuprofen=>PTGS1 not found using ChEMBL in building the KG? #219

Closed saramsey closed 6 years ago

saramsey commented 6 years ago

Maybe start looking here: https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL521

also maybe do some testing with QueryChembl.py

edeutsch commented 6 years ago

Yeah, I didn’t mention this specifically, but my interpretation concerns this page: https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL521 and the section shown in the attached image (image001.png):

ChEMBL is saying that, with 100% confidence, COX inhibition is the mechanism of action, and thus there is a very high-confidence interaction with COX1 and COX2: https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL2094253 Gold. Confidence = 1.00

The list at the bottom is just additional predictions about other possible interactions, most of which are probably wrong. But they may be worth keeping with a low confidence (as I previously said, I would arbitrarily multiply their prediction scores by a number < 1, perhaps 0.5).

So I don’t know how the API returns the data, but from the web site, I infer that the gold interactions are to be had from the “Mechanism of Action” section. Then the predictions are a separate speculative thing.

Can the API be interpreted and loaded that way?

Eric

jaredroach commented 6 years ago

I can find almost no evidence (e.g., PubMed, Google) that ibuprofen targets neprilysin (maybe something like PMID: 11543770, but that is a bit weak). Neprilysin gets a perfect score of "1.00" in ChEMBL "Target Predictions". My suggestion is that we NOT import ChEMBL 'Target Predictions', at least for now. Rather, just import 'ChEMBL Target'. Which for ibuprofen is the cyclooxygenases.

DeqingQu commented 6 years ago

@saramsey I queried the relationship between 'ibuprofen' and 'PTGS1' on our rtxdev database with: match (:chemical_substance {name:'ibuprofen'})-[r]->(:protein {name:'PTGS1'}) return r. I got the following result, so I think the ChEMBL-derived relationship between 'ibuprofen' and 'PTGS1' is present in our KG.

{
  "is_defined_by": "RTX",
  "predicate": "directly_interacts_with",
  "probability": 0.15768860613,
  "source_node_uuid": "df841cee-5907-11e8-95d6-060473434358",
  "provided_by": "ChEMBL",
  "target_node_uuid": "de830ada-5907-11e8-95d6-060473434358",
  "seed_node_uuid": "dbf1734c-5907-11e8-95d6-060473434358",
  "relation": "targets"
}
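
For repeated checks of this kind, the same lookup can be written with query parameters instead of names pasted into the query text. This is a minimal sketch, not code from the RTX repository: `EDGE_QUERY` and `edge_query_params` are hypothetical names, and the `$name` parameter syntax assumes a Neo4j version that supports it.

```python
# Parameterized form of the Cypher query quoted above.
# Node labels (chemical_substance, protein) come from the original query;
# the parameter names are illustrative assumptions.
EDGE_QUERY = (
    "MATCH (:chemical_substance {name: $drug_name})"
    "-[r]->(:protein {name: $protein_name}) "
    "RETURN r"
)


def edge_query_params(drug_name, protein_name):
    """Build the parameter dict that accompanies EDGE_QUERY."""
    return {"drug_name": drug_name, "protein_name": protein_name}


params = edge_query_params("ibuprofen", "PTGS1")
# Assumption: with the official neo4j Python driver, an open session could
# run this as session.run(EDGE_QUERY, params).
```

Keeping the names out of the query text means the same query string can be reused for any drug/protein pair.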

I also checked the QueryChEMBL module. When I called QueryChEMBL.get_target_uniprot_ids_for_drug('ibuprofen'), I got the following response:

{
    'P08473': 0.99980037155,
    'O00763': 0.99266688751,
    'Q04609': 0.98942916865,
    'P08253': 0.94581002279,
    'P17752': 0.91994871445,
    'P03956': 0.89643421164,
    'P42892': 0.87107050119,
    'Q9GZN0': 0.86383549859,
    'P12821': 0.8620779016,
    'P15144': 0.85733534851,
    'Q9BYF1': 0.83966001458,
    'P22894': 0.78062167118,
    'P14780': 0.65826285102,
    'P08254': 0.61116303205,
    'P23219': 0.35927660575,      # PTGS1
    'P37268': 0.25590346332,
    'P17655': 0.1909881306,
    'P07858': 0.1306186469,
    'P06734': 0.1130695383,
    'P50052': 0.111298188
}

The id of 'PTGS1' is 'UniProtKB:P23219', and it appears in the response from 'get_target_uniprot_ids_for_drug' with confidence = 0.35927660575.

But the confidence from the API query (0.35927660575) differs substantially from the probability in our KG (0.15768860613). I think the confidence between 'ibuprofen' and 'PTGS1' should be constant, so there may be a bug somewhere in our code.

I need more biological background knowledge to better understand the issues. If I misunderstood anything, please correct me. BTW, I will fix the exception bug in QueryChEMBL and write test cases for the module.
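
The comparison above can be made explicit with a small check. This is only a sketch: `score_mismatch` and its tolerance are hypothetical and not part of QueryChEMBL, and the numbers are copied from the outputs quoted in this comment.

```python
import math

# Subset of the scores returned by
# QueryChEMBL.get_target_uniprot_ids_for_drug('ibuprofen'), as quoted above.
api_scores = {
    "P08473": 0.99980037155,
    "P23219": 0.35927660575,  # PTGS1
}

# Probability stored on the ibuprofen -> PTGS1 edge in the KG, as quoted above.
kg_probability = 0.15768860613


def score_mismatch(api_scores, uniprot_id, kg_probability, rel_tol=1e-6):
    """Return (API score - KG probability) if the two values disagree
    beyond rel_tol, or None if they agree.

    Hypothetical helper for illustration; raises KeyError if uniprot_id
    is not present in api_scores.
    """
    api_score = api_scores[uniprot_id]
    if math.isclose(api_score, kg_probability, rel_tol=rel_tol):
        return None
    return api_score - kg_probability


diff = score_mismatch(api_scores, "P23219", kg_probability)
```

A non-None result flags an edge whose stored probability no longer matches what the API returns, which is the situation described here for PTGS1.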

edeutsch commented 6 years ago

I think the issue is that these returned targets are merely predicted ones, along with their prediction scores. This list does not include what is well known and curated. I think that information is there in ChEMBL, but it is encoded in a different way, as described above. If we stick with ChEMBL, I think we need to figure out how to capture that information with high importance/weight and confidence, and these predictions far less so.
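
One way to picture this weighting is sketched below. It is an illustration under stated assumptions, not the code that was later committed: `build_target_edges`, the edge field names, and the `evidence` labels are all hypothetical, and the 0.5 factor is the arbitrary value suggested earlier in this thread.

```python
PREDICTION_WEIGHT = 0.5  # arbitrary down-weighting factor (assumption)


def build_target_edges(curated_target_ids, predictions,
                       prediction_weight=PREDICTION_WEIGHT):
    """Combine curated targets and predicted targets into one edge list.

    curated_target_ids: ChEMBL target IDs from the curated mechanism of action.
    predictions: dict mapping UniProt accession -> prediction score.
    Curated targets get confidence 1.0; predicted scores are scaled down.
    """
    edges = []
    for target_id in curated_target_ids:
        edges.append({
            "target_id": target_id,
            "evidence": "mechanism_of_action",
            "confidence": 1.0,
        })
    for uniprot_id, score in predictions.items():
        edges.append({
            "target_id": "UniProtKB:" + uniprot_id,
            "evidence": "target_prediction",
            "confidence": score * prediction_weight,
        })
    return edges


edges = build_target_edges(
    curated_target_ids=["CHEMBL2094253"],
    predictions={"P08473": 0.99980037155, "P23219": 0.35927660575},
)
```

With this scheme no predicted target can outrank a curated one, since a scaled prediction score is at most 0.5.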

DeqingQu commented 6 years ago

The new get_mechanisms_for_chembl_id method retrieves the mechanism of action and the target for each drug. Here is an example response from the ChEMBL API:

{
"mechanisms": [{
    "action_type": "INHIBITOR",
    "binding_site_comment": null,
    "direct_interaction": true,
    "disease_efficacy": true,
    "max_phase": 4,
    "mec_id": 1180,
    "mechanism_comment": null,
    "mechanism_of_action": "Cyclooxygenase inhibitor",
    "mechanism_refs": [{
        "ref_id": "0443-059748 PP. 229",
        "ref_type": "ISBN",
        "ref_url": "http://www.isbnsearch.org/isbn/0443059748"
        },  
        {
        "ref_id": "Ibuprofen",
        "ref_type": "Wikipedia",
        "ref_url": "http://en.wikipedia.org/wiki/Ibuprofen"
        }],
    "molecular_mechanism": true,
    "molecule_chembl_id": "CHEMBL521",
    "record_id": 1343587,
    "selectivity_comment": null,
    "site_id": null,
    "target_chembl_id": "CHEMBL2094253"
    }],
"page_meta": {
    "limit": 20,
    "next": null,
    "offset": 0,
    "previous": null,
    "total_count": 1
    }
}

From @saramsey

Looks like “page_meta” includes things like the limit on the result-set size (number of records and number of pages) returned by the query? Probably can be ignored.
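
Dropping "page_meta" and keeping only the mechanism records might look like the sketch below. It is not the actual body of get_mechanisms_for_chembl_id: `extract_mechanisms` and `has_more_pages` are hypothetical helpers, and reading a non-null "next" as "more results are available" is an assumption about the paging block.

```python
def extract_mechanisms(response):
    """Return the mechanism records from a ChEMBL-style response dict,
    discarding the "page_meta" paging block."""
    return list(response.get("mechanisms", []))


def has_more_pages(response):
    """True if page_meta carries a non-null "next" entry (assumed to mean
    that further results exist beyond this page)."""
    return response.get("page_meta", {}).get("next") is not None


# Abbreviated copy of the response quoted above.
response = {
    "mechanisms": [{
        "action_type": "INHIBITOR",
        "mechanism_of_action": "Cyclooxygenase inhibitor",
        "molecule_chembl_id": "CHEMBL521",
        "target_chembl_id": "CHEMBL2094253",
    }],
    "page_meta": {"limit": 20, "next": None, "offset": 0,
                  "previous": None, "total_count": 1},
}

mechanisms = extract_mechanisms(response)
target_ids = [m["target_chembl_id"] for m in mechanisms]
```

For ibuprofen the paging block reports a single record and no next page, so ignoring it loses nothing; a check like `has_more_pages` would only matter for drugs with more records than the page limit.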

The final return value format of the 'get_mechanisms_for_chembl_id' function is as follows.

 [
        {
    "action_type": "INHIBITOR",
    "binding_site_comment": null,
    "direct_interaction": true,
    "disease_efficacy": true,
    "max_phase": 4,
    "mec_id": 1180,
    "mechanism_comment": null,
    "mechanism_of_action": "Cyclooxygenase inhibitor",
    "mechanism_refs": [{
        "ref_id": "0443-059748 PP. 229",
        "ref_type": "ISBN",
        "ref_url": "http://www.isbnsearch.org/isbn/0443059748"
        },  
        {
        "ref_id": "Ibuprofen",
        "ref_type": "Wikipedia",
        "ref_url": "http://en.wikipedia.org/wiki/Ibuprofen"
        }],
    "molecular_mechanism": true,
    "molecule_chembl_id": "CHEMBL521",
    "record_id": 1343587,
    "selectivity_comment": null,
    "site_id": null,
    "target_chembl_id": "CHEMBL2094253"
    }
]

saramsey commented 6 years ago

@DeqingQu I have just committed code to fix issue #219 (main branch). Can you please look at it? If you don't see any issues with the code, can you also please migrate the latest commit to your branch? Thanks, Steve

saramsey commented 6 years ago

@edeutsch I have committed a code patch to address this issue. We will get it rolled into the code that is building the next version of the KG.

edeutsch commented 6 years ago

great, thanks, I look forward to seeing the new results!

saramsey commented 6 years ago

@DeqingQu can you let me know when the code has been merged into the issue249-253 branch? Thanks, Steve

DeqingQu commented 6 years ago

@saramsey The code was migrated into the issue249-253 last night.

saramsey commented 6 years ago

fixed
