Closed: buniello closed this issue 3 months ago
@JarrodBaker let us know if you do have the time to have a look at this.
This endpoint will in principle be exposed in AOTF. Tagging @d0choa and @remo87: has this been looked into?
This hasn't been fixed.
We need to confirm whether this still applies or not
Yes it does, and it's important
After discussing the findings of the code review, we are going to add descriptions to the arguments of the associations on the fly (AOTF) query, so that users know what to expect when they apply an argument. This is needed because the calculation differs between the scenario where a target is fixed and the scenario where a disease is fixed.
I don't know what you found in your code review and I appreciate better documentation, but I worry that this isn't addressing the main problem. I think (some of) the results for my example query in the bug report (https://community.opentargets.org/t/spurious-indirect-association-evidence-via-graphql-api/879) are flat-out wrong. Querying disease associations for the ADAR gene (with "enableIndirect: true"), I get very high scores from "chembl" ("datasourceScores") and "known_drug" ("datatypeScores"). But as far as I can see ChEMBL doesn't list any drugs targeting ADAR (UniProt ID P55265) and as far as I know there aren't any. I think it would be worth investigating the data behind these specific scores and checking if they actually make sense.
What @remo87 is trying to document is the reason behind the behaviour you are observing.
When fixing a target entity (e.g. ADAR), the current `enableIndirect: true` will propagate the evidence in the protein-protein interaction network. That means that the association might be based on proteins interacting with ADAR and not ADAR itself.
This behaviour differs from `enableIndirect: true` when fixing a disease, in which case the evidence is propagated through the disease ontology.
As discussed before, this behaviour is not exploited in the UI and the data dumps only capture the direct/indirect propagation of evidence through the disease ontology. For now, we are documenting the API endpoint to prevent more confusion. We have several streams of work to better exploit the interaction data since we know it's a relevant strategy to identify disease-relevant targets.
Does this make sense?
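To make the target-fixed behaviour described above concrete, here is a toy Python sketch. The interactor name `PARTNER_A`, the scores, and the use of `max` to combine evidence are illustrative assumptions; this thread does not describe how the platform actually aggregates propagated evidence.

```python
# Toy illustration only: names and scores below are hypothetical,
# and max() is a stand-in for whatever aggregation the platform really uses.

# Direct evidence: (target, disease, datasource) -> score
direct_evidence = {
    ("ADAR", "EFO_0000222", "europepmc"): 0.66,
    ("PARTNER_A", "EFO_0000222", "chembl"): 0.99,  # hypothetical interactor
}

# Hypothetical protein-protein interaction neighbours
interactions = {"ADAR": {"PARTNER_A"}}


def datasource_scores(target, disease, enable_indirect):
    """Return {datasource: score} for one target-disease pair."""
    sources = {target}
    if enable_indirect:
        # Target fixed: also collect evidence attached to interacting proteins.
        sources |= interactions.get(target, set())
    scores = {}
    for (t, d, datasource), score in direct_evidence.items():
        if t in sources and d == disease:
            scores[datasource] = max(score, scores.get(datasource, 0.0))
    return scores


direct_only = datasource_scores("ADAR", "EFO_0000222", enable_indirect=False)
with_indirect = datasource_scores("ADAR", "EFO_0000222", enable_indirect=True)
print(direct_only)
print(with_indirect)
```

In this toy setup a `chembl` score appears for ADAR only when indirect propagation is enabled, even though the drug evidence belongs to the interactor rather than to ADAR itself.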
> When fixing a target entity (e.g. ADAR), the current enableIndirect: true will propagate the evidence in the protein-protein interaction network.
@d0choa, thank you for clarifying. I was not aware of this at all.
Is there any way (through the API) to get the information that I thought I was querying, i.e. disease associations for a given target, but with evidence propagated through the disease ontology? Often this is what you want, e.g. for an association with "breast cancer" to take into account evidence for all subtypes of breast cancer.
Unfortunately, not through the API. We can have a look, but we might run into performance issues. To compute this we would need to propagate evidence in the ontology for every row in the heatmap, not just for the fixed entity as is currently implemented.
If you work with "breast cancer" and this is your fixed entity, what you are describing is the default behaviour. You are looking at "breast cancer" and all the subtypes of breast cancer. Doing the same for the 360 diseases associated with ADAR at the same time is what is not available in the API.
It's a lot easier to do these types of queries with the data dumps, but I understand this is not your use case.
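For readers working from the data dumps, the ontology propagation discussed above can be sketched in a few lines of Python. The parent map, the placeholder disease IDs, and the scores are made up, and `max` is only a placeholder aggregation; none of this reflects the dump schema or the platform's scoring.

```python
# Hypothetical child -> parents map for a small slice of a disease ontology.
parents = {
    "SUBTYPE_1": {"BREAST_CANCER"},
    "SUBTYPE_2": {"BREAST_CANCER"},
    "BREAST_CANCER": {"CANCER"},
}

# Hypothetical direct scores for one target, keyed by disease.
direct_scores = {"SUBTYPE_1": 0.4, "SUBTYPE_2": 0.7}


def ancestors(disease):
    """Collect every ancestor of a disease by walking the parent map."""
    seen = set()
    stack = [disease]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def indirect_scores(scores):
    """Propagate each direct score to the disease itself and all its ancestors."""
    result = {}
    for disease, score in scores.items():
        for node in {disease} | ancestors(disease):
            result[node] = max(score, result.get(node, 0.0))
    return result


propagated = indirect_scores(direct_scores)
print(propagated["BREAST_CANCER"])
```

The point of the sketch is that the propagation has to run for every disease row of the target, which is the cost mentioned above for doing this on the fly in the API.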
Understood. Could it work for a single target-disease association at a time (e.g. "ADAR - breast cancer")?
Following up from this community post, we plan to investigate and fix the problem causing unexpected behaviour for the `enableIndirect` API endpoint. This endpoint is not exposed in the frontend and is therefore not extensively tested. This is the example API query provided by the user:
```
query associatedDiseases {
  target(ensemblId: "ENSG00000160710") {
    id
    approvedSymbol
    associatedDiseases(enableIndirect: true) {
      count
      rows {
        score
        disease {
          id
          name
        }
        datasourceScores {
          id
          score
        }
        datatypeScores {
          id
          score
        }
      }
    }
  }
}
```

The output:

```
{
  "data": {
    "target": {
      "id": "ENSG00000160710",
      "approvedSymbol": "ADAR",
      "associatedDiseases": {
        "count": 11197,
        "rows": [
          {
            "score": 0.915009673664289,
            "disease": {
              "id": "EFO_0000222",
              "name": "acute myeloid leukemia"
            },
            "datasourceScores": [
              {
                "id": "chembl",
                "score": 0.9920171593572428
              },
              [...]
            ],
            "datatypeScores": [
              {
                "id": "known_drug",
```

First observations suggesting this is a backend problem:
1) In the example reported by the user, no ChEMBL evidence is displayed on the evidence page.
2) A BigQuery query on the `associationByDatasourceIndirect` table does not return any such indirect evidence either.

BigQuery response:
```
SELECT *
FROM `open-targets-eu-dev.platform_dev.associationByDatasourceIndirect`
WHERE targetId = "ENSG00000160710"
  AND diseaseId = "EFO_0000222"
LIMIT 10
```

```
datatypeId  datasourceId  diseaseId    targetId         score        evidenceCount
literature  europepmc     EFO_0000222  ENSG00000160710  0.656767905  6
```
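The comparison behind these two observations can be sketched in Python as a set difference between the datasources scored by the API and those present in the table. The values are copied from the API output and BigQuery row quoted in this issue, trimmed to the visible fields; the function name `unsupported_datasources` is hypothetical, and the datasources elided by `[...]` in the API output are not covered.

```python
# Values copied from the API output and BigQuery row quoted in this issue;
# only the fields visible there are included.
api_row = {
    "disease": {"id": "EFO_0000222", "name": "acute myeloid leukemia"},
    "datasourceScores": [{"id": "chembl", "score": 0.9920171593572428}],
}

bigquery_rows = [
    {
        "datatypeId": "literature",
        "datasourceId": "europepmc",
        "diseaseId": "EFO_0000222",
        "targetId": "ENSG00000160710",
        "score": 0.656767905,
        "evidenceCount": 6,
    }
]


def unsupported_datasources(api_row, table_rows):
    """Datasources scored by the API with no matching row in the table."""
    disease_id = api_row["disease"]["id"]
    in_table = {
        row["datasourceId"] for row in table_rows if row["diseaseId"] == disease_id
    }
    in_api = {entry["id"] for entry in api_row["datasourceScores"]}
    return sorted(in_api - in_table)


missing = unsupported_datasources(api_row, bigquery_rows)
print(missing)
```

With the quoted data, `chembl` is the datasource that the API scores but the table does not back up for this target-disease pair.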