moosa-r / rbioapi

rbioapi: User-Friendly R Interface to Biologic Web Services' API
https://rbioapi.moosa-r.com/
GNU General Public License v3.0

getting genes associated with PANTHER terms #3

Closed ktyssowski closed 2 years ago

ktyssowski commented 2 years ago

Thanks for this helpful tool! After running rba_panther_enrich(), how can I get the genes from my input list that are associated with each significant term? I can't seem to find a function that does this. Thanks!

moosa-r commented 2 years ago

Dear @ktyssowski

Unfortunately, this PANTHER API endpoint doesn't return which of the user's genes fall within each significant term. However, if you are performing an over-representation analysis of your genes against GO terms, I think you have two options here:

  1. In rbioapi, use other enrichment services. Please see this article: https://rbioapi.moosa-r.com/articles/rbioapi_do_enrich.html

  2. If you specifically need to use PANTHER tools, you can do something along these lines: I. Perform the enrichment analysis using PANTHER. II. Retrieve the genes in each GO term using biomaRt or AnnotationHub. III. Do a simple set intersection with your input genes.

I wouldn't recommend the second approach, since the GO annotation versions can differ and you may run into reproducibility issues. Nevertheless, if you need it, I can write a code snippet for this.
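As a rough illustration of step III only, the set intersection could look like the following minimal sketch in base R. Everything in it is a placeholder assumption: the gene symbols, the term-to-gene mapping (which in practice would come from biomaRt or AnnotationHub in step II), and the column names of the result table, which are hypothetical and may not match what rba_panther_enrich() actually returns.

```r
# Hypothetical input: the genes submitted to the enrichment analysis.
input_genes <- c("TP53", "BRCA1", "EGFR", "MYC", "CDK2")

# Hypothetical term-to-gene mapping, standing in for the annotations
# retrieved in step II (e.g. via biomaRt or AnnotationHub).
term_to_genes <- list(
  "GO:0006281" = c("TP53", "BRCA1", "ATM", "RAD51"),
  "GO:0007049" = c("CDK2", "MYC", "TP53", "CCNB1"),
  "GO:0006412" = c("RPL3", "EIF4E")
)

# Hypothetical table of significant terms; the column names are assumptions
# and may differ from the actual rba_panther_enrich() output.
enrich_res <- data.frame(
  term_id = c("GO:0006281", "GO:0007049"),
  fdr = c(0.01, 0.03),
  stringsAsFactors = FALSE
)

# Step III: intersect each significant term's gene set with the input genes.
hits_per_term <- lapply(
  term_to_genes[enrich_res$term_id],
  function(term_genes) intersect(input_genes, term_genes)
)

# Attach the hits as a semicolon-separated column, plus a hit count.
enrich_res$gene_hits <- unname(
  vapply(hits_per_term, paste, character(1), collapse = ";")
)
enrich_res$n_hits <- unname(lengths(hits_per_term))

print(enrich_res)
```

The hit counts obtained this way will only agree with the counts PANTHER reports if the annotation source matches the release PANTHER used, so a mismatch between the two is a useful warning sign.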

P.S.: I have checked the PANTHER API reference to make sure that this limitation does not come from rbioapi's implementation of their services. rbioapi is a bit behind, as 3 endpoints are not implemented yet, but that is unrelated to this issue.

erzakiev commented 1 year ago

This is also interesting to me. I wonder if the Panther API still doesn't provide the genes that were 'hits' in the enriched terms?

And I'd still prefer to use PANTHER and its 'GO slim' databases, as they seem to give very concise lists of terms (i.e. only a few lines) instead of the long lists of GO terms returned by, for example, a simple over-representation search through the C2 collection of MSigDB using clusterProfiler.

moosa-r commented 1 year ago

Dear @erzakiev, I see your point. The response format hasn't changed. Adding an extra column with the gene hits to the response would be a trivial task, but to prevent future problems and keep the package maintainable, I need to stay strictly faithful to the response format of the supported API.

If you need to use PANTHER via rbioapi, you can use something like the workaround I proposed above. But please keep in mind an important caveat: when mapping GO terms to your genes, make sure to use the same annotation release that PANTHER uses.