RTIInternational / biocloud_docker_tools

Docker library providing a catalog of images for RTI's cloud-based bioinformatics toolkit.
https://hub.docker.com/u/rtibiocloud

Create tximport object with counts from length-scaled TPMs #47

Open. bryancquach opened this issue 1 month ago.

bryancquach commented 1 month ago

Overview

This issue pertains to merge_salmon_quant/1.10.2/merge_salmon_quant.R. Based on the tximport vignette and this publication, incorporating average transcript lengths into count normalization can be helpful. DESeq2 does this automatically when the tximport list is loaded with DESeqDataSetFromTximport, but edgeR and limma do not: edgeR needs an offset matrix built by hand, and limma-voom does not use an offset matrix at all. So when limma is the downstream differential expression tool, the list imported with tximport needs to contain counts derived from length-scaled TPMs in order to take advantage of average transcript length in the normalization. An additional code snippet should be added to the R script to generate counts that already account for average transcript lengths. This can be done by setting countsFromAbundance when importing the Salmon files to produce gene counts:

tximport(..., type = "salmon", countsFromAbundance = "lengthScaledTPM")
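
To make the effect of that option concrete, here is a minimal base-R sketch of the scaling that countsFromAbundance = "lengthScaledTPM" performs, as I understand it from the tximport documentation: abundances are multiplied by each gene's average length across samples, then each sample is rescaled to its original library size. The function name length_scaled_counts and all of the toy values are made up for illustration; this is not tximport's own code and not the code in merge_salmon_quant.R.

```r
# Toy inputs: 3 genes x 2 samples. All values are invented for illustration.
dn <- list(c("geneA", "geneB", "geneC"), c("s1", "s2"))
counts     <- matrix(c(100, 200, 700, 150, 250, 600), nrow = 3, dimnames = dn)
abundance  <- matrix(c(10, 40, 50, 20, 30, 50), nrow = 3, dimnames = dn)  # TPM-like
avg_length <- matrix(c(1000, 500, 2000, 1500, 500, 2000), nrow = 3, dimnames = dn)

# Hypothetical helper mirroring the "lengthScaledTPM" idea (an assumption,
# not the tximport implementation itself).
length_scaled_counts <- function(counts, abundance, avg_length) {
  # Scale each gene's abundance by its average length across samples.
  scaled <- abundance * rowMeans(avg_length)
  # Rescale each sample (column) so it sums to the original library size.
  sweep(scaled, 2, colSums(counts) / colSums(scaled), "*")
}

ls_counts <- length_scaled_counts(counts, abundance, avg_length)
print(ls_counts)
```

Each sample's total count is unchanged; only the distribution of counts across genes is adjusted. With tximport itself, the resulting counts element of the returned list can then be passed to edgeR's DGEList and on to limma's voom, which is the workflow the tximport vignette describes for limma.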