biothings / pending.api

Set of standalone APIs built with the BioThings SDK for the Translator Project
https://biothings.ncats.io
Apache License 2.0

more specific operations for MyChem `chembl.drug_mechanisms` data #100

Closed colleenXu closed 7 months ago

colleenXu commented 1 year ago

Related to https://github.com/biothings/BioThings_Explorer_TRAPI/issues/532#issuecomment-1358768948

MyChem chembl.drug_mechanisms data, in subject-association-object format

colleenXu commented 1 year ago

Related issue: https://github.com/biothings/BioThings_Explorer_TRAPI/issues/316, JQ development work

colleenXu commented 1 year ago

One reason to create a pending api / adjust the parser for that resource: the gene ID being CHEMBL.TARGET is a problem that isn't solved by post-processing (JQ).

newgene commented 1 year ago

Cross-ref a related issue here: https://github.com/biothings/mygene.info/issues/105 (mapping from CHEMBL Target ID to gene id)

colleenXu commented 1 year ago

Notes on the data

(6306 records total https://www.ebi.ac.uk/chembl/g/#browse/mechanisms_of_action/)

Drugs

Not all drugs are "Small molecule". In rough order most to least:

Targets

Not all targets are human. In rough order from most to least:

Not all targets are proteins. In rough order from most to least:

Categories of drug mechanisms

When browsing ChEMBL https://www.ebi.ac.uk/chembl/g/#browse/mechanisms_of_action (roughly in order of most to least):

🧇 not as helpful for the creative-mode issue 532

colleenXu commented 1 year ago

Example of current MyChem structure vs association-based structure

Background

ANG1005 (CHEMBL1089636) has two drug-mechanisms that are different categories:

In MyChem

These two drug-mechanisms are nested inside the chembl.drug_mechanisms field of the MyChem record for this chemical: https://mychem.info/v1/query?q=_exists_:%22chembl.drug_mechanisms%22%20AND%20chembl.molecule_chembl_id:CHEMBL1089636 . Because a query returns the whole record, BTE post-processing (JQ?) is needed to retrieve only the INHIBITOR drug-mechanism (or only the BINDING_AGENT one).

{
    "chembl": {
        "molecule_chembl_id": "CHEMBL1089636",
        "drug_mechanisms": [
            {"action_type": "INHIBITOR", "references": {...}, "binding_site_name": null, "target_chembl_id": "CHEMBL2095182", "target_uniprot_accession": ["P68371", ...]},
            {"action_type": "BINDING_AGENT", "references": {...}, "binding_site_name": null, "target_chembl_id": "CHEMBL4630884", "target_uniprot_accession": "Q07954"}
        ]
    }
}
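
To make the post-processing step concrete, here is a rough Python sketch of the filtering that would have to happen after the query. It is not the JQ that BTE would actually use; Python is just swapped in for illustration. The function name `mechanisms_with_action` is made up, and the example record is trimmed to the fields shown above, with the elided ones left out.

```python
def mechanisms_with_action(record, action_type):
    """Return the drug-mechanism entries of one MyChem-style record that match action_type."""
    mechanisms = record.get("chembl", {}).get("drug_mechanisms", [])
    # Assumption: a lone mechanism might come back as a dict instead of a one-item list
    if isinstance(mechanisms, dict):
        mechanisms = [mechanisms]
    return [m for m in mechanisms if m.get("action_type") == action_type]


# Trimmed copy of the ANG1005 record; elided fields are omitted, not filled in
record = {
    "chembl": {
        "molecule_chembl_id": "CHEMBL1089636",
        "drug_mechanisms": [
            {"action_type": "INHIBITOR", "binding_site_name": None,
             "target_chembl_id": "CHEMBL2095182"},
            {"action_type": "BINDING_AGENT", "binding_site_name": None,
             "target_chembl_id": "CHEMBL4630884"},
        ],
    }
}

inhibitors = mechanisms_with_action(record, "INHIBITOR")
print([m["target_chembl_id"] for m in inhibitors])  # ['CHEMBL2095182']
```

The point is that the whole record comes back from the query, and the narrowing to one action_type happens afterwards on the client side.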

Association-based structure

However, if we use an association-based structure, we can make two separate records, and each one can be retrieved on its own by setting association.action_type in the query.

{
    "subject": { "drug_chembl_id": "CHEMBL1089636", ...},
    "association": { "action_type": "INHIBITOR", "references": {...}, "binding_site_name": null},
    "object": { "target_chembl_id": "CHEMBL2095182", "target_uniprot_accession": ["P68371", ...]}
},
{
    "subject": { "drug_chembl_id": "CHEMBL1089636", ...},
    "association": { "action_type": "BINDING_AGENT", "references": {...}, "binding_site_name": null},
    "object": { "target_chembl_id": "CHEMBL4630884", "target_uniprot_accession": "Q07954"}
}
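
A parser could produce records like these by emitting one document per nested drug-mechanism. The sketch below is only an illustration of that idea, not the actual pending-API parser: the function name `to_association_records` is hypothetical, and the field grouping simply follows the example above (subject keeps the drug ID, association keeps action_type, references, and binding_site_name, object keeps the target fields).

```python
def to_association_records(record):
    """Split one MyChem-style record into one association record per drug-mechanism."""
    chembl = record.get("chembl", {})
    mechanisms = chembl.get("drug_mechanisms", [])
    # Assumption: a lone mechanism might be a dict instead of a one-item list
    if isinstance(mechanisms, dict):
        mechanisms = [mechanisms]

    association_keys = ("action_type", "references", "binding_site_name")
    object_keys = ("target_chembl_id", "target_uniprot_accession")

    records = []
    for mech in mechanisms:
        records.append({
            "subject": {"drug_chembl_id": chembl.get("molecule_chembl_id")},
            # Copy only the keys that are present, so missing fields stay missing
            "association": {k: mech[k] for k in association_keys if k in mech},
            "object": {k: mech[k] for k in object_keys if k in mech},
        })
    return records


# Trimmed copy of the ANG1005 record; elided fields are omitted, not filled in
record = {
    "chembl": {
        "molecule_chembl_id": "CHEMBL1089636",
        "drug_mechanisms": [
            {"action_type": "INHIBITOR", "binding_site_name": None,
             "target_chembl_id": "CHEMBL2095182"},
            {"action_type": "BINDING_AGENT", "binding_site_name": None,
             "target_chembl_id": "CHEMBL4630884",
             "target_uniprot_accession": "Q07954"},
        ],
    }
}

associations = to_association_records(record)
for assoc in associations:
    print(assoc["association"]["action_type"], assoc["object"]["target_chembl_id"])
```

Note that target_uniprot_accession is passed through untouched, so it can still be either a list or a single string, as in the example above.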

rjawesome commented 1 year ago

I've started a pending API python script if the association-based structure is preferred. Will post repo soon.

colleenXu commented 1 year ago

Note, I'm not sure how to handle the records that may lack an "action_type" value...they seem to lack a lot of information...

https://mychem.info/v1/query?q=_exists_:chembl.drug_mechanisms%20AND%20(NOT%20_exists_:%22chembl.drug_mechanisms.action_type%22)&fields=chembl
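
One possible way to deal with these, sketched in Python: separate the mechanisms that have an action_type from those that don't, so the incomplete ones can be reviewed or skipped rather than silently mixed in. This is only one option, not a decision. The function name `split_by_action_type` is hypothetical, and the second entry in the example data is invented purely for illustration (its ID is a placeholder, not a real ChEMBL target).

```python
def split_by_action_type(mechanisms):
    """Partition drug-mechanism entries by whether they carry a usable action_type."""
    with_action, without_action = [], []
    for mech in mechanisms:
        # Treat a missing key, None, and an empty string all as "no action_type"
        if mech.get("action_type"):
            with_action.append(mech)
        else:
            without_action.append(mech)
    return with_action, without_action


mechanisms = [
    {"action_type": "INHIBITOR", "target_chembl_id": "CHEMBL2095182"},
    # Hypothetical incomplete entry, made up for this example
    {"target_chembl_id": "PLACEHOLDER_TARGET_ID"},
]

complete, incomplete = split_by_action_type(mechanisms)
print(len(complete), len(incomplete))
```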

rjawesome commented 1 year ago

See https://github.com/rjawesome/mychem-drug-mechanisms

colleenXu commented 1 year ago

At the moment, creating a new API is not necessary.

colleenXu commented 1 year ago

Leaving open; in the future, we may want to write more specific operations (see the third bullet point "haven't done this yet" in the post above). I have therefore moved this issue to "on-hold".

colleenXu commented 7 months ago

Closing for now: will open another issue to consider writing more specific operations using filter/jmespath