Closed lcolladotor closed 3 years ago
An example 2x2 table could be like this:
table(topTable$adj.P.val < 0.05, rowRanges(rse_gene)$gene_id %in% cell_marker_gene_ids)
where cell_marker_gene_ids
changes for every cell type. We might want to consider the top 100 marker genes per cell type, unlike the top 25 like we did in #7.
note we already did CSEA of BPD in matt's preprint and you can probably just reuse or point to results of that effort (Figure 4, Table S6) there wasnt any enrichment of microglia for bipolar in any brain region [image: image.png]
On Thu, Apr 8, 2021 at 2:23 PM Leonardo Collado-Torres < @.***> wrote:
An example 2x2 table could be like this:
table(topTable$adj.P.val < 0.05, rowRanges(rse_gene)$gene_id %in% marker_gene_ids)
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/LieberInstitute/zandiHyde_bipolar_rnaseq/issues/10#issuecomment-816040456, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEYY2BACVQR27PKATSZMSTTHXYD5ANCNFSM42TL5DSQ .
Hi,
Louise and I just met to talk about this. We'll likely also need to talk about it with Peter.
I didn't remember the analysis from Matt's pre-print https://www.biorxiv.org/content/10.1101/2020.10.07.329839v1.full so thanks for bringing it up Andrew!
With that in mind, we still want to make a plot like Figure 6 from the spatial transcriptomics paper or Figure 4 / Figure S12 from Matt's pre-print. That involves using:
gene_list = list()
with 4 elements (bipolar DE genes up in AMY, DE genes down in AMY, DE genes up in sACC, and DE genes down in sACC) (focusing only on DE genes for now, not exons, junctions, transcripts), each element being the Ensembl gene IDs. Then modeling_results
would need to be a list()
with 1 data.frame that uses the same column names we had in spatialLIBD::fetch_data(type = "modeling_results")
. Here we can hack our list of cell type marker genes to fit this object structure. Or we can use some of the internal code at https://github.com/LieberInstitute/spatialLIBD/blob/master/R/gene_set_enrichment.R#L88-L108 without having to hack the inputs.Then we can check the cell type marker genes vs DE results and see if they match (likely not) the cell type marker genes (defined by Matt; different from the ones we are using in #7) vs GWAS risk genes (through MAGMA).
Hm... I think that we can simplify this a bit. We can use Matt's data to find the list of cell-type marker genes using the broad cell types. Those are the same genes used for #7. We can then test for each cell type if they are enriched among the DE genes (like compute an odds ratio from 2 x 2 tables followed by http://research.libd.org/jaffelab/reference/getOR.html), or do a GSEA with the DE t-tests. For the latter, https://bioconductor.github.io/BiocWorkshops/functional-enrichment-analysis-of-high-throughput-omics-data.html#functional-class-scoring-permutation-testing seems quite straight forward though I haven't used it myself.
I think that this would be ok as the reviewer is interested in more analyses showing how microglia genes are linked to BD.
So then it becomes: