sr320 / ceabigr

Workshop on genomic data integration with an emphasis on epigenetic data (FHL 2022)

Functional enrichment analysis of genes that show differential alternative splicing patterns between treatments #102

Closed: AHuffmyer closed this issue 3 months ago

AHuffmyer commented 3 months ago

Here are lists of the genes in males and females that show different alternative splicing patterns between treatments. We can do functional enrichment, correlate with methylation, and anything else we can think of! These were generated by extracting the top 20 genes (those showing the strongest treatment differentiation) from each pattern/PC detected by ASCA. There are 40 genes listed for females and 40 for males. If there aren't enough genes to do functional enrichment, let me know and I can extract more. I went with 20 for our first pass.

The gene lists can be found here:

Females: https://github.com/sr320/ceabigr/blob/main/output/77-asca-exon/females_splicing_treatments_gene_list.csv

Males: https://github.com/sr320/ceabigr/blob/main/output/77-asca-exon/males_splicing_treatments_gene_list.csv
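As a rough illustration of the "top 20 genes per pattern/PC" step described above, here is a minimal sketch. It is not the script that produced these lists: it assumes genes are ranked by the absolute value of a per-PC loading, and the function name, gene IDs, and loading values are all made up for the example.

```python
def top_genes_per_pc(loadings, n=20):
    """Return {pc: [gene, ...]} with the n genes of largest absolute loading per PC.

    `loadings` maps a PC name to a {gene_id: loading} dict (hypothetical layout).
    """
    top = {}
    for pc, gene_loadings in loadings.items():
        # Rank genes by how strongly they contribute to this PC, ignoring sign.
        ranked = sorted(gene_loadings, key=lambda g: abs(gene_loadings[g]), reverse=True)
        top[pc] = ranked[:n]
    return top


# Hypothetical loadings for two PCs (gene IDs and values are invented).
example_loadings = {
    "PC1": {"LOC_A": 0.91, "LOC_B": -0.75, "LOC_C": 0.10, "LOC_D": -0.02},
    "PC2": {"LOC_A": 0.05, "LOC_B": 0.20, "LOC_C": -0.88, "LOC_D": 0.64},
}

top2 = top_genes_per_pc(example_loadings, n=2)
for pc, genes in top2.items():
    print(pc, genes)
```

Raising `n` in a call like this would be the equivalent of "extracting more" genes if the lists turn out to be too short for enrichment.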

sr320 commented 3 months ago

@AHuffmyer Can you also provide a list of genes that did not differ, for comparison (methylation and functional enrichment)? Patterns were tightly correlated.

AHuffmyer commented 3 months ago

Done!

Females: https://github.com/sr320/ceabigr/blob/main/output/77-asca-exon/females_splicing_nodifference_gene_list.csv

Males: https://github.com/sr320/ceabigr/blob/main/output/77-asca-exon/males_splicing_nodifference_gene_list.csv

yaaminiv commented 3 months ago

@AHuffmyer @sr320 I conducted an enrichment test using genes with and without a difference as the gene background, and alternatively spliced genes as the genes of interest. There were no enriched terms in males or females.

I'm wondering if I should conduct the enrichment using a different background (i.e., all genes used in the analysis) instead of just the top predictors from each PC. However, the way I did it (gene background = top predictors from each PC, irrespective of splicing pattern) is probably the most conservative. Open to thoughts!
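To make the background question concrete, here is a minimal sketch of a one-sided hypergeometric enrichment test for a single term. It is not the enrichment tool used for this analysis; the function name and all gene sets are hypothetical. It only shows how the same genes of interest can look unremarkable against a small background and enriched against a larger one.

```python
from math import comb


def hypergeom_enrichment_p(background, interest, term_genes):
    """One-sided P(X >= k): chance of seeing at least the observed overlap
    between `interest` and `term_genes` when drawing from `background`."""
    background = set(background)
    interest = set(interest) & background   # genes of interest must be in the background
    term = set(term_genes) & background     # only annotated genes present in the background count
    N, K, n = len(background), len(term), len(interest)
    k = len(interest & term)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


# Hypothetical gene sets (IDs are invented for illustration).
all_genes = [f"g{i}" for i in range(100)]      # e.g. every gene tested
top_predictors = all_genes[:8]                 # small background: top predictors only
spliced = all_genes[:4]                        # genes of interest
term = {"g0", "g1", "g2", "g50", "g51"}        # genes annotated with one term

p_small = hypergeom_enrichment_p(top_predictors, spliced, term)
p_large = hypergeom_enrichment_p(all_genes, spliced, term)
print(f"small background: p = {p_small:.4f}")
print(f"all genes as background: p = {p_large:.6f}")
```

In this toy case the overlap is the same three genes either way, but the small background leaves little room for the overlap to look surprising, which is the sense in which it is the more conservative choice.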

sr320 commented 3 months ago

Do all genes as background

AHuffmyer commented 3 months ago

Did you take a look at the functional annotation for that list of genes? I'm just curious if there are any genes in particular that are interesting. We could describe individual genes that show up in that list even if there are no enriched terms.

sr320 commented 3 months ago

Agree that we should minimally annotate them.... @yaaminiv is there a table with LOC_ID and protein annotation, GO information?

yaaminiv commented 3 months ago

> Agree that we should minimally annotate them.... @yaaminiv is there a table with LOC_ID and protein annotation, GO information?

Not yet but I'm in the process of making it!

yaaminiv commented 3 months ago

Used all genes tested in ASCA as the background and got some enrichment:

Female:

Male:

AHuffmyer commented 3 months ago

Cool! Do we have descriptions for molecular/biological level functions for these GO terms? Maybe in a table?

yaaminiv commented 3 months ago

Yes! They're in the GO term results table, without annotations. I'm going to make a master table with all enrichment results + GO term names + gene product information so we can compare across tests and home in on important functions. #105
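A master table like the one described could be assembled roughly as follows. This is only a sketch under assumed inputs: the field names, the placeholder GO IDs, the gene IDs, and the product strings are hypothetical and do not come from the actual results files.

```python
def build_master_table(enrichment_rows, go_names, gene_products):
    """Flatten enrichment results to one row per (test, GO term, gene),
    attaching the GO term name and gene product where known."""
    rows = []
    for row in enrichment_rows:
        for gene in row["genes"]:
            rows.append({
                "test": row["test"],
                "go_id": row["go_id"],
                "go_name": go_names.get(row["go_id"], "NA"),   # "NA" when no name is available
                "gene": gene,
                "product": gene_products.get(gene, "NA"),      # "NA" when no annotation is available
                "p_value": row["p_value"],
            })
    return rows


# Hypothetical inputs (placeholder IDs and values, not real results).
enrichment_rows = [
    {"test": "female", "go_id": "GO:EXAMPLE1", "genes": ["LOC_A", "LOC_B"], "p_value": 0.004},
    {"test": "male", "go_id": "GO:EXAMPLE2", "genes": ["LOC_C"], "p_value": 0.010},
]
go_names = {"GO:EXAMPLE1": "example term name"}
gene_products = {"LOC_A": "example protein", "LOC_C": "another example protein"}

master = build_master_table(enrichment_rows, go_names, gene_products)
for r in master:
    print(r)
```

Keeping one row per gene and term makes it straightforward to filter by test (female vs. male) or to spot terms that recur across tests.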