yerkes-gencore / bulk_template

Template for ENPRC Genomics Core bulk RNAseq projects

Overhauling the limma-edgeR-based workflow based on `p23181_Abbie_Analysis` and `p23224_Matthew_Analysis` #13

Closed micahpf closed 2 months ago

micahpf commented 3 months ago

Overhauling the limma-edgeR-based workflow based on `p23181_Abbie_Analysis` and `p23224_Matthew_Analysis`.

The two workflows cover two common scenarios:

  1. The dataset taken from `p23181_Abbie_Analysis` is a cross-sectional study with two factors, `gender` and `site`, plus two batches for each `gender` x `site` combination. `gender` and `site` are pasted into one factor, so the model formula is `~ gender_site + batch` and the contrasts compare levels of the `gender_site` factor.

  2. The dataset taken from `p23224_Matthew_Analysis` is a longitudinal (repeated measures) design where the same subjects are sampled repeatedly on different days (encoded in factor `day`). Thus, `subjectID` should be modeled as a random effect to maximize the power and interpretability of the model. The subjects are also subdivided into two groups (encoded in factor `grp`), which the clients wanted to contrast. We again pasted `grp` and `day` into one factor, `grp.day`. The model formula was `~ grp.day`, and we used `limma::duplicateCorrelation` (or rather `block = subjectID` in our `fitVoomLm` wrapper) to model subject as a random effect.
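
As a rough illustration of the two designs above, here is a minimal sketch in R. It is not the code from either project: the sample sheets, factor levels (`F`/`M`, `A`/`B`, `ctrl`/`trt`, `d0`/`d7`/`d14`), the simulated counts, and the chosen contrasts are all made up for the example. It also calls limma directly rather than the `fitVoomLm` wrapper, whose signature is not shown here, and it does a single `voom` / `duplicateCorrelation` pass for brevity. The limma part is skipped if limma is not installed.

```r
# --- Scenario 1: cross-sectional, two factors pasted into one, plus batch ---
# Hypothetical sample sheet: 2 genders x 2 sites x 2 batches (8 samples).
meta1 <- expand.grid(gender = c("F", "M"), site = c("A", "B"),
                     batch = c("b1", "b2"), stringsAsFactors = FALSE)
meta1$gender_site <- factor(paste(meta1$gender, meta1$site, sep = "_"))
meta1$batch <- factor(meta1$batch)

# Same formula as described above; the first level (F_A) is the reference.
design1 <- model.matrix(~ gender_site + batch, data = meta1)

# Example contrast between two levels of gender_site: M_B vs F_B.
contrast1 <- setNames(numeric(ncol(design1)), colnames(design1))
contrast1[c("gender_siteM_B", "gender_siteF_B")] <- c(1, -1)

# --- Scenario 2: repeated measures, subject handled via blocking ---
# Hypothetical sample sheet: 6 subjects, each sampled on 3 days (18 samples).
meta2 <- expand.grid(subjectID = paste0("s", 1:6),
                     day = c("d0", "d7", "d14"), stringsAsFactors = FALSE)
meta2$grp <- ifelse(meta2$subjectID %in% c("s1", "s2", "s3"), "ctrl", "trt")
meta2$grp.day <- factor(paste(meta2$grp, meta2$day, sep = "."))

# Subject is NOT in the formula; it enters through `block` further down.
design2 <- model.matrix(~ grp.day, data = meta2)

# Example contrast: trt vs ctrl at day 7.
contrast2 <- setNames(numeric(ncol(design2)), colnames(design2))
contrast2[c("grp.daytrt.d7", "grp.dayctrl.d7")] <- c(1, -1)

if (requireNamespace("limma", quietly = TRUE)) {
  # Simulated counts stand in for real data (200 genes x 18 samples).
  set.seed(1)
  counts <- matrix(rnbinom(200 * nrow(meta2), mu = 100, size = 5), nrow = 200,
                   dimnames = list(paste0("gene", 1:200), NULL))

  v <- limma::voom(counts, design2)

  # Estimate the consensus within-subject correlation across genes.
  corfit <- limma::duplicateCorrelation(v, design2, block = meta2$subjectID)

  # Fit the model with subject as the blocking variable.
  fit <- limma::lmFit(v, design2, block = meta2$subjectID,
                      correlation = corfit$consensus.correlation)
  fit <- limma::contrasts.fit(fit, contrasts = cbind(trt_vs_ctrl_d7 = contrast2))
  fit <- limma::eBayes(fit)
  print(limma::topTable(fit, coef = "trt_vs_ctrl_d7", number = 5))
}
```

Because both formulas keep the intercept, each `gender_site` or `grp.day` coefficient is a difference from the reference level, so a contrast between two non-reference levels is the difference of their coefficients, as in `contrast1` and `contrast2`.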

All of the changes to functions from the `gencoreBulk` package are re-defined near the top of the script and in the `R/heatmap_functions.R` file. These should probably be ported over to the `gencoreBulk` package eventually, if they seem stable and flexible enough.

micahpf commented 2 months ago

Excellent, thanks Kivanc! I'll merge it now.