Closed: tmmurali closed this issue 3 years ago
Let us catalogue gene sets here. We need to download each one (see #5) and add it to the enrichment analysis.
Currently the downloadable gmt file available for the COVID-19 Crowd Generated Gene sets does not have the main descriptor text of the gene set in the file, making most gene sets unidentifiable.
I made an issue on their repo (#82) asking them to fix it.
@jlaw9 @n-tasnina what is the status of running our enrichment pipeline on the COVID-19 gene sets?
We have the COVID-19 gene sets in GMT format; we just need to update our scripts to test them for enrichment. Here's the clusterProfiler documentation for our own gene sets: https://guangchuangyu.github.io/2015/05/use-clusterprofiler-as-an-universal-enrichment-analysis-tool/
@n-tasnina can you add a function for that in our enrichment.py?
Yeah, sure. I will add a function in enrichment.py to do this.
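For reference, a minimal sketch of reading gene sets from a GMT file in Python. It assumes the standard tab-separated GMT layout (set name, description, then gene symbols). The function name `read_gmt` and the example rows are made up for illustration and are not taken from enrichment.py.

```python
import io


def read_gmt(handle):
    """Parse GMT lines into {set_name: (description, set_of_genes)}.

    Hypothetical helper: assumes one gene set per line, tab-separated,
    with the name in column 1 and the description in column 2.
    """
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, description, genes = fields[0], fields[1], fields[2:]
        # Drop empty entries left by trailing tabs.
        gene_sets[name] = (description, {g for g in genes if g})
    return gene_sets


# Illustrative input only; the second row has an empty description field,
# which is the situation described earlier in this thread.
example = io.StringIO(
    "set_A\tdemo description\tACE2\tTMPRSS2\tFURIN\n"
    "set_B\t\tIL6\tTNF\n"
)
gene_sets = read_gmt(example)
```

Keeping the description next to the gene set makes it easy to spot entries whose descriptor text is missing.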
We can close this issue as well. Here is the link to the Python script where we ran the enrichment analysis: https://github.com/Murali-group/SARS-CoV-2-network-analysis/blob/enrichment/src/Enrichment/fss_enrichment.py
We have a ranked list of predictions coming from network propagation or from host-virus PPI prediction. This issue is relevant mainly for human proteins. We also have a set of gene sets, e.g., from https://amp.pharm.mssm.edu/covid19/. We want to assess to what extent each gene set is enriched in our list of predictions.
There are two approaches I suggest:
We must correct for testing multiple hypotheses.
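To make the multiple-testing point concrete, here is a rough sketch of one possible setup. It is not necessarily either of the approaches referred to above, and it is not the code in fss_enrichment.py. It assumes we take the top k predictions from the ranked list, run a one-sided hypergeometric test per gene set, and adjust the p-values with the Benjamini-Hochberg procedure. All function names and the toy data are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, M, n, N):
    """P(X >= k) when N genes are drawn from a universe of M genes,
    n of which belong to the gene set."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enrich_top_k(ranked_genes, gene_sets, k):
    """Test each gene set for over-representation among the top k genes.

    Assumption: the universe is the full ranked list, and each gene set
    is restricted to genes that appear in it.
    Returns {set_name: (p_value, adjusted_p_value)}.
    """
    universe = set(ranked_genes)
    top = set(ranked_genes[:k])
    names = sorted(gene_sets)
    pvals = []
    for name in names:
        members = gene_sets[name] & universe
        overlap = len(members & top)
        pvals.append(
            hypergeom_pvalue(overlap, len(universe), len(members), len(top))
        )
    return dict(zip(names, zip(pvals, benjamini_hochberg(pvals))))


# Toy data for illustration only.
ranked = [f"g{i}" for i in range(1, 11)]
toy_sets = {"hit": {"g1", "g2", "g3"}, "miss": {"g8", "g9", "g10"}}
results = enrich_top_k(ranked, toy_sets, k=3)
```

The choice of k and of the background universe both affect the p-values, so they should be fixed before looking at the results.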