biostars / biostar-handbook

Issue tracker for the Biostar Handbook

edger.r and deseq2.r #154

Closed njbowen closed 3 years ago

njbowen commented 3 years ago

Question:

Do the R scripts edger.r and deseq2.r that you have kindly provided in the Biostar Handbook use gene_exon_length_total in any of the normalization methods? That is, is any kind of RPKM calculated or used?

I may be confusing the different R scripts, and they may differ between the alignment-based and the classification-based methods. However, I am assuming they all use the last 6 columns in an N x M analysis, as well as the gene_id column from the count table. Looking through the code of edger.r and deseq2.r, as downloaded from the RNA-Seq by Example sections of the Biostar Handbook, I didn't see gene_exon_length_total or a similar value from the count tables being used.

So should I provide TPM or RPKM values to your edger.r and deseq2.r scripts, or supply the raw count values with an exon_length column as well?

Thanks for any insight.

best, Nathan

njbowen commented 3 years ago

After reading more about edgeR and DESeq2, I realize that they do not use gene length in their normalization methods.
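
For anyone landing here with the same question, below is a rough sketch of the median-of-ratios idea behind DESeq2's size factors, written in plain Python as an illustration. It is not the code from deseq2.r, and the function name `size_factors` and the toy counts are made up for this example. The point is that only raw counts go in, and gene length never appears. (edgeR uses a different scheme, TMM, which likewise does not use gene length.)

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (illustrative sketch only).

    counts: list of rows, one row per gene, one raw count per sample.
    Returns one scaling factor per sample. Gene length is never used.
    """
    n_samples = len(counts[0])
    # Skip genes with a zero in any sample, because log(0) is undefined.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log of the geometric mean of each gene across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Per-gene log ratio of this sample to the geometric mean.
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical toy data: sample 2 was sequenced twice as deeply as sample 1.
counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 5],  # excluded from the size factor estimate because of the zero
]
factors = size_factors(counts)
normalized = [[c / f for c, f in zip(row, factors)] for row in counts]
```

In this toy case the second factor comes out twice the first, so dividing by the factors puts the two samples on the same scale, which is why the scripts are given raw counts rather than RPKM or TPM.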

ialbert commented 3 years ago

Yes. Both RPKM and TPM have fallen out of favor. Current methods operate on the actual counts rather than on values transformed into what originally seemed more "human"-oriented, easier-to-interpret measures.
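
For reference, this is roughly what the length-based measures compute, as a small Python sketch using the standard definitions. The function names and the toy numbers are hypothetical; none of this is taken from edger.r or deseq2.r, which expect raw counts as input.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts)
    return [c * 1e9 / (total_reads * length) for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then rescale to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r * 1e6 / total_rate for r in rates]


# Hypothetical toy data for one sample: counts proportional to gene length.
counts = [100, 200, 700]
lengths_bp = [1000, 2000, 7000]

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)
```

Here the three genes end up with identical RPKM and TPM values even though their raw counts differ sevenfold. The count-based statistical models rely on the magnitude of the raw counts, which these transformations discard.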