nunofonseca / irap

integrated RNA-seq Analysis Pipeline
GNU General Public License v3.0

Fix duplicate exon error for featureCounts #88

Closed pinin4fjords closed 5 years ago

pinin4fjords commented 6 years ago

This fixes a major bug in using featureCounts to count exons. GTF files frequently have the same exon listed multiple times in different transcripts, and current behaviour has two problems:

This PR passes -O to featureCounts so that reads can be assigned to multiple exon entries, and then adds a sorting and de-duplication step when processing the output so that a single entry per exon is retained.
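For illustration only, the sort and de-duplicate step could look roughly like the Python sketch below. This is not the code in this PR. It assumes the standard tab-delimited featureCounts table (a `#` comment line, a header row, then `Geneid`, `Chr`, `Start`, `End`, `Strand`, `Length` and one count column per BAM file), and it assumes that duplicate rows for the same exon carry the same counts, so keeping the first one loses nothing. The function name `dedup_exon_rows` is hypothetical.

```python
def dedup_exon_rows(lines):
    """Sort featureCounts feature rows and keep one row per unique exon.

    Hypothetical helper: an exon is identified here by the first five
    columns (identifier, chromosome, start, end, strand).
    """
    comments, header, rows = [], None, []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            comments.append(line)          # e.g. the "# Program:featureCounts ..." line
        elif header is None:
            header = line                  # first non-comment line is the column header
        else:
            rows.append(line.split("\t"))

    # Sort by identifier first, then by coordinates, so duplicates are adjacent
    # and the output order is deterministic.
    rows.sort(key=lambda r: (r[0], r[1], int(r[2]), int(r[3]), r[4]))

    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[:5])
        if key not in seen:                # keep only the first row for each exon
            seen.add(key)
            unique.append(row)

    out = list(comments)
    if header is not None:
        out.append(header)
    out.extend("\t".join(r) for r in unique)
    return out
```

Called on the lines of a featureCounts output file, this returns the comment and header lines unchanged, followed by the exon rows sorted by identifier with repeated exons collapsed to one row.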

One side effect is that all featureCounts outputs are now sorted by feature or meta-feature identifier. I can't see why this would be a problem, but @nunofonseca would know better than I do.