Bad practice recommendation: htseq

Some tool choices in RNA-seq are a matter of preference, in that the evidence is not clear enough to support an empirically or theoretically motivated choice. However, some things are demonstrably bad practice. Quantifying expression at the gene level by counting the reads that map to each gene (e.g. using htseq-count) is one of them.

There is a substantial literature showing that it's important to quantify at the isoform level, and that gene-level quantification by read counting is at best misleading. It's also theoretically unsound: see Fig. 1b of Trapnell et al. 2013 for an illustration of just one reason.
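
To make the theoretical objection concrete, here is a toy calculation of one such problem: a gene that switches between a short and a long isoform while the number of transcript molecules stays the same. The isoform names, lengths, and abundances are invented for illustration and are not taken from the cited paper; the only assumption is that the expected number of reads from a transcript scales with abundance times length.

```python
# Toy model: expected read mass from a transcript is proportional to
# (number of molecules) x (transcript length). All numbers are made up.

ISOFORM_LENGTHS = {"short_isoform": 1000, "long_isoform": 4000}  # bases


def gene_level_signal(molecules):
    """Sum abundance x length over all isoforms of one gene."""
    return sum(molecules[name] * ISOFORM_LENGTHS[name] for name in molecules)


# Same total number of molecules (100) in both conditions,
# but the gene switches entirely from the short to the long isoform.
condition_a = {"short_isoform": 100, "long_isoform": 0}
condition_b = {"short_isoform": 0, "long_isoform": 100}

signal_a = gene_level_signal(condition_a)
signal_b = gene_level_signal(condition_b)
fold_change = signal_b / signal_a

print(f"molecules: {sum(condition_a.values())} vs {sum(condition_b.values())}")
print(f"apparent gene-level fold change from raw counts: {fold_change}")
```

In this toy case a raw gene-level count would report a four-fold change even though the number of molecules did not change at all, which is why the counts have to be assigned to isoforms before they can be interpreted.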

Why not recommend a best-practice pipeline that works for both model-organism and de novo assembled references? Something like:

1. Align reads to a FASTA file of transcripts with bowtie2 (no need for spliced alignment).
2. Assign multi-mapping reads and quantify using eXpress.
3. Perform DE on the estimated transcript counts using whatever method.
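
As a rough sketch of the first two steps, the commands could be assembled like this. The file names, the output directory layout, and the helper `build_pipeline_commands` are placeholders of mine, and I have not tested this against any particular bowtie2 or eXpress version, so check the flags against the tools' own help output. The `-a` flag asks bowtie2 to report all alignments, so that eXpress sees the multi-mapping reads it is supposed to assign.

```python
def build_pipeline_commands(transcripts_fa, reads_1, reads_2, out_dir, threads=4):
    """Return the command lines (as argument lists) for index, align, quantify.

    Nothing is executed here; pass each list to subprocess.run() to run it.
    All paths are hypothetical placeholders.
    """
    index_prefix = f"{out_dir}/transcripts"
    sam_path = f"{out_dir}/hits.sam"
    return [
        # Build a bowtie2 index directly from the transcript FASTA.
        ["bowtie2-build", transcripts_fa, index_prefix],
        # Unspliced alignment to transcripts, reporting all alignments (-a).
        ["bowtie2", "-a", "-p", str(threads), "-x", index_prefix,
         "-1", reads_1, "-2", reads_2, "-S", sam_path],
        # eXpress assigns multi-mapping reads and estimates transcript counts.
        ["express", "-o", f"{out_dir}/express", transcripts_fa, sam_path],
    ]


commands = build_pipeline_commands("transcripts.fa", "reads_1.fq", "reads_2.fq", "out")
for command in commands:
    print(" ".join(command))
```

The estimated per-transcript counts that eXpress writes out would then be the input to step 3.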
