chembl / chembl_webresource_client

Official Python client for accessing ChEMBL API
https://www.ebi.ac.uk/chembl/api/data/docs

Mechanism results do not match what is on the website #117

Closed BartoszBartmanski closed 2 years ago

BartoszBartmanski commented 2 years ago

When running the following:

from chembl_webresource_client.new_client import new_client
from tqdm import tqdm

mechanism_res = new_client.mechanism.filter(drug_chembl_id="CHEMBL269732")
mechanism_ids = {x["target_chembl_id"] for x in tqdm(mechanism_res)}
len(mechanism_ids)

I get 1358 unique target_chembl_id entries, even though the website shows only one entry.

juanfmx2 commented 2 years ago

Unfortunately, when a query includes an unknown parameter/filter, it is silently ignored. In this case drug_chembl_id is not a property of the mechanism endpoint (https://www.ebi.ac.uk/chembl/api/data/mechanism.json), so you are retrieving every record from that endpoint instead of only the ones related to CHEMBL269732. You should use the property parent_molecule_chembl_id instead.

from chembl_webresource_client.new_client import new_client
from tqdm import tqdm

mechanism_res = new_client.mechanism.filter(parent_molecule_chembl_id="CHEMBL269732")
mechanism_ids = {x["target_chembl_id"] for x in tqdm(mechanism_res)}
len(mechanism_ids)

I hope this helps. We are considering adding an error message that tells you when an unknown filter is being used.
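Until the client reports unknown filters itself, a caller-side guard is one way to catch the mistake early. The following is only a minimal sketch: `unknown_filters` and `MECHANISM_FIELDS` are hypothetical names, not part of chembl_webresource_client, and the field set lists only the two properties mentioned in this thread. In practice it would need to be filled in from the keys of a record returned by the endpoint.

```python
# Hypothetical helper, not part of chembl_webresource_client.
def unknown_filters(filters, known_fields):
    """Return the filter names whose base field is not in known_fields.

    Lookup suffixes such as "__in" are stripped before the comparison,
    so "target_chembl_id__in" is checked as "target_chembl_id".
    """
    unknown = []
    for name in filters:
        base = name.split("__", 1)[0]
        if base not in known_fields:
            unknown.append(name)
    return sorted(unknown)


# Assumption: an incomplete, hand-written subset of the mechanism fields,
# limited to the two properties named in this thread.
MECHANISM_FIELDS = {"parent_molecule_chembl_id", "target_chembl_id"}

bad = unknown_filters({"drug_chembl_id": "CHEMBL269732"}, MECHANISM_FIELDS)
if bad:
    print("Filters that the endpoint would ignore:", bad)
```

Running the check before calling filter() turns a silently ignored parameter into a visible warning, at the cost of keeping the field list up to date by hand.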

BartoszBartmanski commented 2 years ago

Thanks for your help!

However, I encountered another related problem: when I try to search for the targets of a compound, I get an empty list even though the website shows 325 targets:

target_res = new_client.target.filter(target_chembl_id="CHEMBL269732")
print(target_res)

The target_chembl_id property is present at https://www.ebi.ac.uk/chembl/api/data/target.json

juanfmx2 commented 2 years ago

This might be a little more complex, because the web interface does additional processing on the information from the web services/database before displaying it.

CHEMBL269732 is an identifier for a compound, not for a target, so you can't use it to filter on the target endpoint directly.

To filter targets based on a compound identifier, you have two options: use the Mechanisms of Action endpoint or the Assay-Activity endpoints.

Mechanisms of Action https://www.ebi.ac.uk/chembl/api/data/mechanism.json?parent_molecule_chembl_id=CHEMBL269732

Activity https://www.ebi.ac.uk/chembl/api/data/activity.json?parent_molecule_chembl_id=CHEMBL269732

From the results you can collect the target_chembl_id values and then use them to filter on the target endpoint:

https://www.ebi.ac.uk/chembl/api/data/target.json?target_chembl_id__in=CHEMBL1902,CHEMBL4445,CHEMBL4438
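With the Python client, the same two-step lookup could be sketched roughly as follows. `targets_for_compound` is a hypothetical helper name; it assumes the client accepts the same `target_chembl_id__in` lookup as the URL above and that each mechanism record behaves like a dict. The client is passed in as an argument so the function can be tried against a stub without network access.

```python
# Hypothetical helper; `client` is expected to be
# chembl_webresource_client.new_client.new_client or an object with the
# same .mechanism.filter(...) and .target.filter(...) methods.
def targets_for_compound(client, compound_id):
    """Return target records linked to a compound via its mechanisms."""
    # Step 1: collect target IDs from the Mechanisms of Action endpoint.
    mechanisms = client.mechanism.filter(parent_molecule_chembl_id=compound_id)
    target_ids = sorted(
        {m["target_chembl_id"] for m in mechanisms if m.get("target_chembl_id")}
    )
    if not target_ids:
        return []
    # Step 2: fetch the matching records from the target endpoint.
    return list(client.target.filter(target_chembl_id__in=target_ids))


# Usage (requires network access and the chembl_webresource_client package):
# from chembl_webresource_client.new_client import new_client
# targets = targets_for_compound(new_client, "CHEMBL269732")
```

Swapping `client.mechanism` for `client.activity` in step 1 would follow the Activity route instead, which typically returns many more records.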

I hope this helps.