tavareshugo / tutorial_DESeq2_contrasts


slide 8 nested contrast not always able to be done with named coefficients #2

Open turkeyri opened 2 years ago

turkeyri commented 2 years ago

I have really enjoyed your tutorial and the effort you made to break it down in steps, layering on complexity and giving concrete examples, derived from different equivalent perspectives - I finally understand.

However, I noticed you recently fixed the order in the contrast statement for 'Three Factors with nesting (Slide 8)' for named coefficients and corrected it to use listValues. I think it would be informative to also mention that not all contrasts can be represented as a list of named coefficients within DESeq2. Specifically, as I understand it, the contrast syntax requires a list of two character vectors: 1) the coefficients that go in the numerator (positive) and 2) the coefficients that go in the denominator (negative). The default listValues = c(1, -1) supplies the positive and negative signs for those two sets in the equivalent numeric contrast vector. This implies that all named coefficients in the numerator share one multiplier (element 1 of listValues) and all named coefficients in the denominator share another (element 2 of listValues); moreover, the documentation requires the latter to be negative.

That works in the pink_shade - white_shade example that you fixed on 12/13/2021: the contrast vector c(0, -0.5, 0.5, 0.5, 0, 0, 0, 0) has the numerator set 0.5{species_C_vs_A, species_D_vs_A} and the denominator set -0.5{species_B_vs_A}, so it is easy to specify listValues = c(0.5, -0.5). The same holds for the other two examples. Since you then have 3 examples with equivalent calls using either a numeric contrast or a list of named coefficients (all using non-default listValues), it might leave an inexperienced reader like me with the impression that all contrast vectors can be represented both ways.
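To make the mapping concrete, here is a minimal base-R sketch of how a named-coefficient list plus listValues expands into a numeric contrast vector. It is only an illustration of my understanding, not DESeq2's internal code: expand_contrast_list is a hypothetical helper, and the coefficient order is an assumption inferred from the contrast vectors quoted above.

```r
# Coefficient names as they would appear in resultsNames(dds); the order is
# an assumption inferred from the contrast vectors quoted in this issue.
coef_names <- c("Intercept", "species_B_vs_A", "species_C_vs_A",
                "species_D_vs_A", "condition_sun_vs_shade",
                "speciesB.conditionsun", "speciesC.conditionsun",
                "speciesD.conditionsun")

# Hypothetical helper (not part of DESeq2): expand a contrast list and its
# listValues into the equivalent numeric contrast vector.
expand_contrast_list <- function(contrast, listValues = c(1, -1),
                                 coefs = coef_names) {
  # One positive multiplier for the numerator, one negative for the denominator.
  stopifnot(length(listValues) == 2, listValues[1] > 0, listValues[2] < 0)
  numerator <- contrast[[1]]
  denominator <- if (length(contrast) > 1) contrast[[2]] else character(0)
  stopifnot(all(c(numerator, denominator) %in% coefs))
  v <- setNames(numeric(length(coefs)), coefs)
  v[numerator] <- listValues[1]    # every numerator coefficient gets the same weight
  v[denominator] <- listValues[2]  # every denominator coefficient gets the same weight
  v
}

# pink_shade - white_shade, as in the fixed slide 8 example
pink_vs_white_shade <- expand_contrast_list(
  list(c("species_C_vs_A", "species_D_vs_A"), "species_B_vs_A"),
  listValues = c(0.5, -0.5)
)
print(pink_vs_white_shade)
```

Because the helper takes only two multipliers, every coefficient within a set necessarily ends up with the same weight, which is the restriction described above.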

However, for the pink_sun - pink_shade contrast mentioned before, the contrast vector is c(0, 0, 0, 0, 1, 0, 0.5, 0.5). The numerator set is {condition_sun_vs_shade, speciesC.conditionsun, speciesD.conditionsun}, but its multipliers would have to be the heterogeneous c(1, 0.5, 0.5), which the DESeq2 syntax does not allow, i.e. you cannot specify listValues = list(c(1, 0.5, 0.5), c(-1)). One could be tempted to write res2_pink_shade_vs_sun <- results(dds, contrast = list(c("condition_sun_vs_shade", "speciesC.conditionsun", "speciesD.conditionsun"))), but that gives every coefficient a weight of 1 and so computes the wrong quantity; the result is labelled "log2 fold change (MLE): condition_sun_vs_shade+speciesC.conditionsun+speciesD.conditionsun effect".
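A quick way to see the difference is to check the weights directly. The base-R sketch below is illustrative only: has_uniform_weights is a hypothetical helper that tests a necessary condition for the list syntax (one shared positive weight, one shared negative weight), and the coefficient order is again assumed from the vectors quoted in this issue.

```r
# Assumed coefficient order, inferred from the contrast vectors in this issue.
coef_names <- c("Intercept", "species_B_vs_A", "species_C_vs_A",
                "species_D_vs_A", "condition_sun_vs_shade",
                "speciesB.conditionsun", "speciesC.conditionsun",
                "speciesD.conditionsun")

# Hypothetical helper: a numeric contrast can only be written as a list with
# listValues if all positive weights are equal and all negative weights are equal.
has_uniform_weights <- function(v) {
  length(unique(v[v > 0])) <= 1 && length(unique(v[v < 0])) <= 1
}

pink_vs_white_shade <- setNames(c(0, -0.5, 0.5, 0.5, 0, 0, 0, 0), coef_names)
pink_sun_vs_shade   <- setNames(c(0, 0, 0, 0, 1, 0, 0.5, 0.5), coef_names)

has_uniform_weights(pink_vs_white_shade)  # TRUE: weights are 0.5 and -0.5
has_uniform_weights(pink_sun_vs_shade)    # FALSE: positive weights are 1 and 0.5

# The tempting list call weights all three coefficients by 1,
# which corresponds to this vector rather than the intended one:
tempting <- as.numeric(coef_names %in% c("condition_sun_vs_shade",
                                         "speciesC.conditionsun",
                                         "speciesD.conditionsun"))
print(rbind(intended = unname(pink_sun_vs_shade), tempting = tempting))

# With a fitted DESeqDataSet `dds` (assumed, not created here), the intended
# contrast would be passed as a numeric vector instead:
# results(dds, contrast = c(0, 0, 0, 0, 1, 0, 0.5, 0.5))
```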

It might be instructive to add a comment that this contrast cannot currently be represented via the DESeq2 contrast list syntax with listValues, but can be represented with the numeric vector syntax.

Thanks again for an excellent tutorial.