LieberInstitute / goesHyde_mdd_rnaseq

Fernando Goes and Thomas Hyde MDD RNA-seq project

Estimate RNA fractions using Tran et al 2020 data #7

Closed: lcolladotor closed this issue 3 years ago

lcolladotor commented 4 years ago

The goal is to estimate the RNA fractions using the data from Tran et al.'s 2020 pre-print: https://www.biorxiv.org/content/10.1101/2020.10.07.329839v1. We will do this separately for sACC and Amygdala.

To do this, we'll use:

  1. minfi as Emily did in the stem cell paper (likely from https://github.com/LieberInstitute/libd_stem_timecourse/tree/master/deconvolution; Matt will point us to the right code)
  2. minfi with the scran statistics that Matt already computed

Louise will work on item 1, while Matt will help us with item 2.

Once we have the cell RNA fractions, we'll use them to check whether they are correlated with qSVs for each brain region separately (I'll detail this more in another issue).
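As a rough illustration of the per-region check described above, here is a minimal Python sketch. It assumes a Pearson correlation (the thread does not say which statistic will be used), and the record fields `region`, `fraction`, and `qsv`, as well as the toy values, are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlate_by_region(samples):
    """Correlate cell fraction with a qSV separately within each brain region.

    `samples` is a list of dicts with hypothetical keys:
    'region', 'fraction' (estimated cell RNA fraction), and 'qsv'.
    """
    by_region = {}
    for sample in samples:
        fractions, qsvs = by_region.setdefault(sample["region"], ([], []))
        fractions.append(sample["fraction"])
        qsvs.append(sample["qsv"])
    return {region: pearson(f, q) for region, (f, q) in by_region.items()}


# Toy values, invented for illustration only.
samples = [
    {"region": "sACC", "fraction": 0.1, "qsv": 1.0},
    {"region": "sACC", "fraction": 0.2, "qsv": 2.0},
    {"region": "sACC", "fraction": 0.3, "qsv": 3.0},
    {"region": "Amygdala", "fraction": 0.1, "qsv": 3.0},
    {"region": "Amygdala", "fraction": 0.2, "qsv": 2.0},
    {"region": "Amygdala", "fraction": 0.3, "qsv": 1.0},
]
correlations = correlate_by_region(samples)
```

In practice this would be repeated for every cell type and every qSV, which is why the samples are grouped by region first instead of being pooled.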

Some notes from Matt's data:

We can use this as the "main" issue for these bullets, but then create smaller issues for doing this work.

If I'm missing anything, please let me know.

Best, Leo

lcolladotor commented 4 years ago

We decided that it's likely easier to try the minfi approach first rather than use https://github.com/xuranw/MuSiC/ as listed in Kayode's paper at https://www.biorxiv.org/content/10.1101/2020.01.19.910976v1, though we might have to circle back to MuSiC in the future.

lahuuki commented 3 years ago

Our ultimate deconvolution strategy was based on the findings of Sosina et al., bioRxiv, 2020. They found the best results with the MuSiC algorithm, snRNA-seq reference data from the same region, and filtered marker genes. To filter genes, we first kept only genes with median expression != 0 in the target cell type, then calculated the ratio of the mean expression in the target cell type to the mean expression in the next-highest non-target cell type. This eliminated the noise between expression in our target cell type and any outlying non-target cell types that we observed with the 1vAll findMarker filtering. We selected the top 5 genes per cell type as markers. Using this set of marker genes, we ran MuSiC in music_deconvo.R. Results were plotted in deconvo_plots.R.
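The marker filtering described above can be sketched roughly as follows. This is a toy Python illustration, not the code from music_deconvo.R: the function names, the gene and cell type labels, and the expression values are all hypothetical, and ranking genes by the ratio to pick the top markers is an assumption about how "top" was defined.

```python
from statistics import mean, median


def marker_ratios(expr, cell_types, target):
    """Return {gene: ratio} for genes passing the median filter in `target`.

    expr: dict mapping gene -> list of per-cell expression values.
    cell_types: list of cell type labels, in the same cell order as expr values.
    """
    ratios = {}
    for gene, values in expr.items():
        by_type = {}
        for value, label in zip(values, cell_types):
            by_type.setdefault(label, []).append(value)
        # Step 1: drop genes whose median expression in the target type is 0.
        if median(by_type[target]) == 0:
            continue
        # Step 2: mean in the target type over the mean in the
        # highest-expressing non-target type.
        next_highest = max(mean(v) for t, v in by_type.items() if t != target)
        target_mean = mean(by_type[target])
        ratios[gene] = float("inf") if next_highest == 0 else target_mean / next_highest
    return ratios


def top_markers(expr, cell_types, n=5):
    """Pick the n genes with the largest ratio for each cell type (assumed ranking)."""
    markers = {}
    for target in sorted(set(cell_types)):
        ratios = marker_ratios(expr, cell_types, target)
        markers[target] = sorted(ratios, key=ratios.get, reverse=True)[:n]
    return markers


# Toy values, invented for illustration only.
cell_types = ["Astro"] * 3 + ["Neuron"] * 3 + ["Oligo"] * 3
expr = {
    "GENE_A": [8.0, 10.0, 9.0, 1.0, 0.0, 2.0, 0.0, 1.0, 0.0],
    "GENE_B": [0.0, 0.0, 12.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    "GENE_C": [1.0, 1.0, 1.0, 6.0, 7.0, 8.0, 5.0, 6.0, 7.0],
}
astro_ratios = marker_ratios(expr, cell_types, "Astro")
markers = top_markers(expr, cell_types, n=1)
```

GENE_B shows why the median filter matters: one outlying Astro cell gives it a high mean, but its median in Astro is 0, so it is never considered as an Astro marker. GENE_C shows why the ratio is taken against the next-highest cell type: it is high in Neuron versus everything else pooled, but Oligo expresses it almost as strongly, so its Neuron ratio is only about 1.17.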