hputnam / Becker_E5

DESeq2 results were backwards for up and down regulated #6

Closed: hputnam closed this 4 months ago

hputnam commented 4 months ago

https://github.com/hputnam/Becker_E5/blob/master/RAnalysis/Scripts/RNA-seq/Host/Host_Differential_Gene_Expression_Analysis.Rmd

Lines 205-207 give the definition and description from the DESeq2 manual.

Line 213 has been changed so the treatments are in the correct order.

This means we need to revisit the downstream results: both the number of genes and the functions classified as upregulated versus downregulated.

AHuffmyer commented 4 months ago

This did not alter the number of DEGs, but changed which were interpreted as up vs down regulated. @daniellembecker let me know if you have any questions.

daniellembecker commented 4 months ago

In my GO enrichment script (https://github.com/hputnam/Becker_E5/blob/master/RAnalysis/Scripts/RNA-seq/Host/Host_GO_Enrichment_Analysis_Func_Annot.Rmd) I make the downregulated and upregulated data frames on lines 99-138. So while I had the numbers calculated wrong in the DESeq2 script :(, I think we are good for the downstream analysis/functions:

#make upreg and downreg data frames
DOWNREG <- DEG.res %>% filter(log2FoldChange < 0)
UPREG <- DEG.res %>% filter(log2FoldChange > 0)
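As a rough illustration of how a sign-based split behaves when the contrast is reversed, here is a minimal base-R sketch. The gene names and fold-change values are made up, and `subset()` stands in for the dplyr `filter()` calls so the example has no package dependencies. It assumes that reversing the contrast simply negates every log2 fold change.

```r
# Hypothetical DEG table; gene names and values are made up for illustration
DEG.res <- data.frame(
  gene = c("g1", "g2", "g3", "g4", "g5"),
  log2FoldChange = c(2.1, -1.4, 0.8, -3.0, 1.2)
)

# Same sign-based split as above, written with base R subset()
UPREG   <- subset(DEG.res, log2FoldChange > 0)
DOWNREG <- subset(DEG.res, log2FoldChange < 0)

# Assumption: reversing the contrast negates every log2 fold change
DEG.flipped     <- transform(DEG.res, log2FoldChange = -log2FoldChange)
UPREG.flipped   <- subset(DEG.flipped, log2FoldChange > 0)
DOWNREG.flipped <- subset(DEG.flipped, log2FoldChange < 0)

# The total number of DEGs is unchanged by the flip ...
same_total <- nrow(DEG.res) == nrow(DEG.flipped)

# ... but the genes labeled "up" before are exactly those labeled "down" after
labels_swapped <- setequal(UPREG$gene, DOWNREG.flipped$gene)
```

So the split itself is fine, but the up and down labels it produces are only as correct as the sign of `log2FoldChange` coming out of the DESeq2 step.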

AHuffmyer commented 4 months ago

The fold change is calculated by the results() command from the DESeq results object. What is critical is the order in which you list treatments in the "contrast" argument. See the code for an explanation and let me know if you have questions!

daniellembecker commented 4 months ago

Thank you, I just looked over the changes and that makes sense.

hputnam commented 4 months ago

Please include the new script information that confirms the fix and the comment of justification/clarification here so we can track everything.

daniellembecker commented 4 months ago

Comments: When running results(), the contrast argument should list its elements in this order: the factor being tested, the numerator (the treatment), and the denominator (the reference treatment).

Contrast is: "a character vector with exactly three elements: the name of a factor in the design formula, the name of the numerator level for the fold change, and the name of the denominator level for the fold change (simplest case)"
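To show why the numerator/denominator order matters, here is a small base-R sketch with made-up mean counts (not data from this project). A plain log2 ratio is used as a stand-in for the fold change; DESeq2's own estimate is model-based, so only the sign behavior is the point here.

```r
# Toy mean normalized counts for one gene (hypothetical values)
mean_control  <- 50
mean_enriched <- 200

# Analogous to contrast = c("treatment", "enriched", "control"):
# enriched is the numerator, control is the denominator
lfc_enriched_vs_control <- log2(mean_enriched / mean_control)

# Analogous to contrast = c("treatment", "control", "enriched"):
# control is the numerator, enriched is the denominator
lfc_control_vs_enriched <- log2(mean_control / mean_enriched)

lfc_enriched_vs_control  # positive: the gene is higher in enriched
lfc_control_vs_enriched  # negative: same gene, opposite sign
```

With the treatment as the numerator, a positive value reads naturally as "upregulated in the treatment relative to the reference".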

Use resultsNames to show you the model outputs from DESeq2.

Command: resultsNames(DEGSeq2_wald)

Output: "Intercept" "treatment_enriched_vs_control"

On line 213 in https://github.com/hputnam/Becker_E5/blob/master/RAnalysis/Scripts/RNA-seq/Host/Host_Differential_Gene_Expression_Analysis.Rmd, changed to include correct order of treatments:

Original: DEGSeq2.results <- results(DEGSeq2_wald, contrast=c("treatment","control","enriched"))

#view DEG analysis results

Updated: DEGSeq2.results <- results(DEGSeq2_wald, contrast=c("treatment","enriched","control"))

#view DEG analysis results

hputnam commented 4 months ago

@daniellembecker a good way to sanity check this is also to add code to plot the genes you separate into up and down and make sure that the directionality matches visually with how you categorized them.
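A numeric version of this check could look something like the base-R sketch below. The count matrix, sample names, and up/down labels are all hypothetical. In the real script the normalized counts would presumably come from something like `counts(DEGSeq2_wald, normalized=TRUE)`, which is an assumption about how the object is set up.

```r
# Hypothetical normalized counts: rows are genes, columns are samples
norm_counts <- matrix(
  c(10, 12, 40, 44,    # geneA: higher in enriched
    90, 85, 20, 25),   # geneB: lower in enriched
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), c("c1", "c2", "e1", "e2"))
)
treatment <- factor(c("control", "control", "enriched", "enriched"))

# Hypothetical up/down labels, as they would come out of the DEG split
labels <- c(geneA = "up", geneB = "down")

# Per-gene mean of the normalized counts within each treatment group
mean_control  <- rowMeans(norm_counts[, treatment == "control", drop = FALSE])
mean_enriched <- rowMeans(norm_counts[, treatment == "enriched", drop = FALSE])

# Direction implied by the raw group means (enriched relative to control)
observed <- ifelse(mean_enriched > mean_control, "up", "down")

# TRUE for every gene means the labels agree with the counts
agrees <- observed == labels[names(observed)]
```

Any gene where `agrees` is FALSE would be worth plotting individually before trusting its label.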

daniellembecker commented 4 months ago

Thank you, @AHuffmyer added this on lines 376-382:

#Visualize fold change of DEGs between treatments

plotCounts(DEGSeq2_wald, gene=rownames(up)[1], intgroup="treatment")

plotCounts(DEGSeq2_wald, gene=rownames(down)[1], intgroup="treatment")

daniellembecker commented 4 months ago

Added downreg and upreg columns/rows to the DESeq2 data frame that is output before the GO enrichment script, and updated the script to address all comments.