Microglia and DE genes - GitHub issues

lcolladotor commented 3 years ago

The authors are arguing that microglia are implicated in BD by pointing to the over-representation of microglia-enriched genes in one WGCNA module associated with BD. There are more direct ways of examining this, such as by testing for over-representation of genes implicated in BD by GWAS or DE studies within cell-type specific markers, as has been done before (e.g. https://doi.org/10.1007/s12035-020-01879-5, https://doi.org/10.1038/s41588-018-0129-5).

[x] Test whether DE genes in either brain region are enriched for cell-type specific marker genes
[ ] PZ will forward to LC results we have from BB on cell-type enrichment of WGCNA modules
[ ] Review to see if we can get list of marker genes and use this to test for cell-type enrichment of DE genes; will test only gene features and will particularly focus on microglia cell type markers to respond to the question

Hm... I think that we can simplify this a bit. We can use Matt's data to find the list of cell-type marker genes using the broad cell types. Those are the same genes used for #7. We can then test, for each cell type, whether its markers are enriched among the DE genes (for example, compute an odds ratio from 2 x 2 tables followed by http://research.libd.org/jaffelab/reference/getOR.html), or do a GSEA with the DE t-statistics. For the latter, https://bioconductor.github.io/BiocWorkshops/functional-enrichment-analysis-of-high-throughput-omics-data.html#functional-class-scoring-permutation-testing seems quite straightforward, though I haven't used it myself.

I think that this would be ok as the reviewer is interested in more analyses showing how microglia genes are linked to BD.

So then it becomes:

[x] Find broad cell type marker genes from #7.
[x] For each cell type and brain region, compute 2x2 tables of DE by cell marker status, then also compute odds ratios.
[ ] For each cell type and brain region, compute GSEA (gene-level only).
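As a rough illustration of the gene-level, rank-based idea in the last item, here is a minimal sketch in base R. It is not GSEA proper: it swaps in a Wilcoxon rank-sum test as a lightweight competitive test on the DE t-statistics. The object names (`de_t`, `microglia_markers`) and all of the data are hypothetical toy stand-ins, not objects from this repository.

```r
set.seed(20210408)

# Toy stand-ins (hypothetical): one DE t-statistic per gene and one marker set
n_genes <- 2000
gene_ids <- sprintf("ENSG%011d", seq_len(n_genes))
de_t <- setNames(rnorm(n_genes), gene_ids)
microglia_markers <- gene_ids[1:100]

# Shift the marker t-statistics so the toy example has a signal to detect
de_t[microglia_markers] <- de_t[microglia_markers] + 1

is_marker <- names(de_t) %in% microglia_markers

# Competitive test: are marker t-statistics shifted relative to all other genes?
rank_test <- wilcox.test(de_t[is_marker], de_t[!is_marker])
rank_test$p.value
```

A permutation-based GSEA would additionally weight genes by their position in the ranked list; the rank-sum test only asks whether the marker genes tend to rank higher or lower than the rest.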

lcolladotor commented 3 years ago

An example 2x2 table could be like this:

table(topTable$adj.P.Val < 0.05, rowRanges(rse_gene)$gene_id %in% cell_marker_gene_ids)

where cell_marker_gene_ids changes for every cell type. We might want to consider the top 100 marker genes per cell type, rather than the top 25 we used in #7.
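To make the per-cell-type loop concrete, here is a minimal sketch under stated assumptions: the DE table and the ranked marker table are simulated, the column names (`cellType`, `rank`) and cell type labels are hypothetical, and `fisher.test()` from base R is used for the odds ratio in place of the jaffelab helper mentioned earlier. It also assumes the DE results are in the same row order as the gene annotation (for limma output, that means not re-sorting by p-value).

```r
set.seed(10)

n_genes <- 1000
gene_ids <- sprintf("ENSG%011d", seq_len(n_genes))

# Hypothetical DE results, in the same row order as the gene annotation
de_results <- data.frame(gene_id = gene_ids, adj.P.Val = runif(n_genes))

# Hypothetical ranked marker table: 150 markers for each broad cell type
cell_types <- c("Micro", "Astro", "Oligo")
markers <- do.call(rbind, lapply(cell_types, function(ct) {
    data.frame(cellType = ct, gene_id = sample(gene_ids, 150), rank = seq_len(150))
}))

top_n <- 100
# Factors with fixed levels guarantee a full 2x2 table even if a cell is empty
is_de <- factor(de_results$adj.P.Val < 0.05, levels = c(FALSE, TRUE))

enrichment <- do.call(rbind, lapply(cell_types, function(ct) {
    ids <- markers$gene_id[markers$cellType == ct & markers$rank <= top_n]
    is_marker <- factor(de_results$gene_id %in% ids, levels = c(FALSE, TRUE))
    tab <- table(DE = is_de, marker = is_marker)
    ft <- fisher.test(tab)
    data.frame(
        cellType = ct,
        n_marker_de = tab["TRUE", "TRUE"],
        odds_ratio = unname(ft$estimate),
        p_value = ft$p.value
    )
}))
enrichment
```

Changing `top_n` from 100 to 25 reproduces the stricter marker definition, which makes it easy to check how sensitive the odds ratios are to that choice.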

andrewejaffe commented 3 years ago

Note that we already did CSEA of BPD in Matt's preprint, and you can probably just reuse or point to the results of that effort (Figure 4, Table S6). There wasn't any enrichment of microglia for bipolar in any brain region.

On Thu, Apr 8, 2021 at 2:23 PM Leonardo Collado-Torres < @.***> wrote:

An example 2x2 table could be like this:

table(topTable$adj.P.Val < 0.05, rowRanges(rse_gene)$gene_id %in% marker_gene_ids)


lcolladotor commented 3 years ago

Hi,

Louise and I just met to talk about this. We'll likely also need to talk about it with Peter.

I didn't remember the analysis from Matt's pre-print https://www.biorxiv.org/content/10.1101/2020.10.07.329839v1.full so thanks for bringing it up Andrew!

Figure 4 https://www.biorxiv.org/content/biorxiv/early/2020/10/08/2020.10.07.329839/F4.large.jpg?width=800&height=600&carousel=1 panel B shows the relationship between cell types and the bipolar GWAS using MAGMA. Figure S12 https://www.biorxiv.org/content/biorxiv/early/2020/10/08/2020.10.07.329839/F16.large.jpg?width=800&height=600&carousel=1 panel A shows the same results for sACC. In both, we see no relationship between microglia and genes linked to bipolar risk (from MAGMA using the GWAS info).
We know from before that genetic risk and expression give different results, for example in https://media.springernature.com/full/springer-static/image/art%3A10.1038%2Fs41593-020-00787-0/MediaObjects/41593_2020_787_Fig6_HTML.png?as=webp and in Figure S47 from https://www.cell.com/cms/10.1016/j.neuron.2019.05.013/attachment/0f56af5c-ec8d-42eb-b6d8-33bf7495e635/mmc1.

With that in mind, we still want to make a plot like Figure 6 from the spatial transcriptomics paper or Figure 4 / Figure S12 from Matt's pre-print. That involves using:

http://research.libd.org/spatialLIBD/reference/gene_set_enrichment.html where gene_list = list() with 4 elements (bipolar DE genes up in AMY, DE genes down in AMY, DE genes up in sACC, and DE genes down in sACC) (focusing only on DE genes for now, not exons, junctions, transcripts), each element being the Ensembl gene IDs. Then modeling_results would need to be a list() with 1 data.frame that uses the same column names we had in spatialLIBD::fetch_data(type = "modeling_results"). Here we can hack our list of cell type marker genes to fit this object structure. Or we can use some of the internal code at https://github.com/LieberInstitute/spatialLIBD/blob/master/R/gene_set_enrichment.R#L88-L108 without having to hack the inputs.
Make the plot with http://research.libd.org/spatialLIBD/reference/gene_set_enrichment_plot.html
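The four-element gene_list described above could be assembled roughly as follows. This is only a sketch: the per-region DE tables are simulated, and the column names (`ensemblID`, `logFC`, `adj.P.Val`) and element names (`AMY_up`, etc.) are assumptions for illustration. It stops at building the list and does not call any spatialLIBD function.

```r
set.seed(10)

n_genes <- 1000
gene_ids <- sprintf("ENSG%011d", seq_len(n_genes))

# Hypothetical per-region DE tables at the gene level only
make_de <- function() {
    data.frame(
        ensemblID = gene_ids,
        logFC = rnorm(n_genes),
        adj.P.Val = runif(n_genes)
    )
}
de_by_region <- list(AMY = make_de(), sACC = make_de())

# One element per region and direction, each holding Ensembl gene IDs
gene_list <- list()
for (region in names(de_by_region)) {
    de <- de_by_region[[region]]
    sig <- de$adj.P.Val < 0.05
    gene_list[[paste0(region, "_up")]] <- de$ensemblID[sig & de$logFC > 0]
    gene_list[[paste0(region, "_down")]] <- de$ensemblID[sig & de$logFC < 0]
}
lengths(gene_list)
```

Keeping up- and down-regulated genes in separate elements means each direction gets its own enrichment test per cell type, which matches the four sets listed above.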

Then we can compare the cell type marker genes vs DE results against the cell type marker genes (defined by Matt; different from the ones we are using in #7) vs GWAS risk genes (through MAGMA) results, and see if they match (likely not).

LieberInstitute / zandiHyde_bipolar_rnaseq

Microglia and DE genes #10