kevinxin90 / RTX_BioThings_Explorer

Integrating BioThings Explorer with RTX Reasoning Tools
Apache License 2.0

QueryPC2::uniprot_id_to_reactome_pathways returns more results than RTX #1

Closed erikyao closed 6 years ago

erikyao commented 6 years ago

Hi Kevin,

MWE in RTX: pc.uniprot_id_to_reactome_pathways("P68871").

MWE in BioThings:

I don't know which side is correct, RTX or BioThings. Could you please check it out? Thanks!

Best, Yao

kevinxin90 commented 6 years ago

@erikyao Hi Yao, my understanding is that you should trust the results from BioThings Explorer (which contain the pathways from both json['searchHit']['pathway'] and json['searchHit']['uri']).

When you make a query like this: http://www.pathwaycommons.org/pc2/search.json?q=P68871&type=pathway

- ['searchHit']['uri'] represents the pathway in which the protein 'P68871' is directly involved, e.g. R-HSA-6798695.
- ['searchHit']['pathway'] represents the parent pathways of 'R-HSA-6798695', e.g. R-HSA-168249 and R-HSA-168256.

I am using one part of the JSON output as an example here:

`{ "uri": "http://identifiers.org/reactome/R-HSA-6798695", "biopaxClass": "Pathway", "name": "Neutrophil degranulation", "dataSource": [ "http://pathwaycommons.org/pc2/reactome" ], "organism": [ "http://identifiers.org/taxonomy/9606" ], "pathway": [ "http://identifiers.org/reactome/R-HSA-168249", "http://identifiers.org/reactome/R-HSA-168256" ], "excerpt": null, "size": 1319, "numParticipants": 1309, "numProcesses": 10 }`

You can double-check this on the Reactome web page: http://reactome.org/content/detail/R-HSA-6798695. When you scroll down to the row 'Participant of', you should see 'Innate Immune System' (the parent pathway of R-HSA-6798695). If you click on it, it should lead you to http://reactome.org/content/detail/R-HSA-168249. And if you look for the parent of R-HSA-168249, you should find R-HSA-168256.

Given that, I think when you query for all pathways that a protein/gene participates in, you should include both sets of results. The current parser in RTX includes only the parent results, which is not sufficient.

BTW, because the PC API is not well designed (e.g. they use URIs to represent values), I chose not to include it in BioThings Explorer. So when you query for the uniprot-reactome association, you are getting results directly from MyGene.info. The pathway data in MyGene.info comes directly from Reactome, which you can surely trust.

Let me know if you have any further questions!
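To make the point about the two fields concrete, here is a minimal Python sketch of a parser that keeps both the direct pathway (`uri`) and its parents (`pathway`). This is not the actual RTX or BioThings Explorer code: the function names are hypothetical, and it assumes that `searchHit` in the PC2 response is a list of hit objects shaped like the one quoted above. The sample response is just that single quoted hit wrapped in a list, so nothing is fetched over the network.

```python
import json
from urllib.parse import urlencode

# Endpoint and URI prefix taken from the URLs quoted in this thread.
PC2_SEARCH_URL = "http://www.pathwaycommons.org/pc2/search.json"
REACTOME_PREFIX = "http://identifiers.org/reactome/"


def build_search_url(uniprot_id):
    """Build the PC2 pathway search URL for a UniProt ID (hypothetical helper)."""
    return PC2_SEARCH_URL + "?" + urlencode({"q": uniprot_id, "type": "pathway"})


def reactome_ids_from_hit(hit):
    """Return the direct pathway ID of one hit, followed by its parent pathway IDs."""
    uris = [hit.get("uri")] + list(hit.get("pathway") or [])
    ids = []
    for uri in uris:
        # Skip missing values and anything that is not a Reactome URI.
        if uri and uri.startswith(REACTOME_PREFIX):
            reactome_id = uri[len(REACTOME_PREFIX):]
            if reactome_id not in ids:
                ids.append(reactome_id)
    return ids


def collect_reactome_pathways(response):
    """Collect de-duplicated Reactome IDs over all hits (assumes 'searchHit' is a list)."""
    ids = []
    for hit in response.get("searchHit", []):
        for reactome_id in reactome_ids_from_hit(hit):
            if reactome_id not in ids:
                ids.append(reactome_id)
    return ids


# The single hit quoted in this thread, wrapped in a "searchHit" list.
sample_response = json.loads("""
{
  "searchHit": [
    {
      "uri": "http://identifiers.org/reactome/R-HSA-6798695",
      "biopaxClass": "Pathway",
      "name": "Neutrophil degranulation",
      "pathway": [
        "http://identifiers.org/reactome/R-HSA-168249",
        "http://identifiers.org/reactome/R-HSA-168256"
      ],
      "excerpt": null
    }
  ]
}
""")

print(build_search_url("P68871"))
print(collect_reactome_pathways(sample_response))
# ['R-HSA-6798695', 'R-HSA-168249', 'R-HSA-168256']
```

A parser that reads only `hit["pathway"]` would return the last two IDs and miss R-HSA-6798695, which is the gap described above.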

erikyao commented 6 years ago

Good. I’ll record this as a bug in RTX.

Thank you, Kevin.

On Thu, May 31, 2018 at 7:32 PM Kevin Xin notifications@github.com wrote:

- ['searchHit']['uri'] represents the pathway in which the protein 'P68871' is directly involved, e.g. R-HSA-6798695.
- ['searchHit']['pathway'] represents the parent pathways of 'R-HSA-6798695', e.g. R-HSA-168249 and R-HSA-168256.
