vanheeringen-lab / seq2science

Automated and customizable preprocessing of Next-Generation Sequencing data, including full (sc)ATAC-seq, ChIP-seq, and (sc)RNA-seq workflows. Works equally well with public and local data.
https://vanheeringen-lab.github.io/seq2science
MIT License

Q: [DEG analysis contrast] #1022

Open bioinfolabmu opened 8 months ago

bioinfolabmu commented 8 months ago

Question: From your documentation, it seems that a contrast in the differential gene expression analysis can only use one factor. What if we need to combine two factors?

What have I tried

In my "samples.tsv", I have two different developmental stages. Each stage has three different treatments. Each treatment has 3 biological samples.

I know that I can do:

contrasts:

How about contrasts such as (stage 4, treatment 1) vs (stage 5, treatment 1), or other combinations of both factors?

stages | treatments
4 | Control
4 | Control
4 | Control
4 | Treatment1
4 | Treatment1
4 | Treatment1
4 | Treatment2
4 | Treatment2
4 | Treatment2
4 | Treatment3
4 | Treatment3
4 | Treatment3
5 | Control
5 | Control
5 | Control
5 | Treatment1
5 | Treatment1
5 | Treatment1
5 | Treatment2
5 | Treatment2
5 | Treatment2
5 | Treatment3
5 | Treatment3
5 | Treatment3

Thank you for your attention

siebrenf commented 8 months ago

You can add additional columns to the samples.tsv. Here is your example:

samples stages treatments st
s1 4 treatment1 s4t1
s2 4 treatment1 s4t1
s3 5 treatment1 s5t1
s4 5 treatment1 s5t1
s5 6 treatment1 s6t1
s6 6 treatment1 s6t1

In the config.yaml, the contrast would now be: st_s5t1_s4t1 or st_s4t1_s5t1 (the new column name, followed by the two groups to compare).

Note: DESeq2 uses every sample that has a label in the contrast column to estimate the dispersion in your data, so it is generally best to give all samples a label in that column.
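
For a larger samples.tsv, the combined column could be generated rather than typed by hand. The following is a minimal sketch in plain Python, not part of seq2science itself. The inline table, the helper names (`combined_label`, `add_combined_column`, `contrast`) and the label scheme ("s" + stage + treatment name, without underscores) are all assumptions made for illustration; the only thing taken from the thread is the column name `st` and the `column_groupA_groupB` shape of the contrast string.

```python
import csv
import io
import re

# Hypothetical, shortened samples.tsv content (illustration only).
SAMPLES_TSV = (
    "sample\tstages\ttreatments\n"
    "s1\t4\tControl\n"
    "s2\t4\tTreatment1\n"
    "s3\t5\tControl\n"
    "s4\t5\tTreatment1\n"
)


def combined_label(stage, treatment):
    """Join two factor values into one label, e.g. ('4', 'Treatment1') -> 's4Treatment1'.

    Non-alphanumeric characters are dropped so the label contains no
    underscores (an assumption: underscores separate the parts of a contrast).
    """
    clean = lambda value: re.sub(r"[^0-9A-Za-z]", "", value)
    return "s" + clean(stage) + clean(treatment)


def add_combined_column(tsv_text, new_col="st"):
    """Parse tab-separated text and add a column combining stages and treatments."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for row in reader:
        row[new_col] = combined_label(row["stages"], row["treatments"])
        rows.append(row)
    return rows


def contrast(column, group_a, group_b):
    """Build a contrast string of the form column_groupA_groupB."""
    return f"{column}_{group_a}_{group_b}"


rows = add_combined_column(SAMPLES_TSV)
labels = [row["st"] for row in rows]
example_contrast = contrast("st", "s5Treatment1", "s4Treatment1")
print(labels)
print(example_contrast)
```

The resulting rows can be written back to samples.tsv with `csv.DictWriter`, and the generated string placed under `contrasts:` in the config.yaml. Any label scheme works as long as each stage and treatment combination gets its own distinct label.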