Opportunity

Long-read sequencers have several potential advantages for RNA-seq over the more typical Illumina short reads. These include:

Superior ability to detect alternative splicing.
Quicker experimental cadence / shorter turnaround time.
Potentially reduced total experimental cost per sample sequenced.

The current nf-core/rnaseq pipeline cannot handle long reads.

Resources

There is an evolving set of tools capable of handling the unique challenges of long reads:

1) [minimap2](https://github.com/lh3/minimap2) to efficiently align the long reads against a reference genome

2) [TranscriptClean](https://github.com/dewyman/TranscriptClean) to filter and correct the alignments for common errors introduced by long-read sequencing technology

Suggestion

Incorporate a module into the nf-core/rnaseq pipeline for handling long reads sourced from cDNA / raw RNA via minimap2 and TranscriptClean, after which the filtered alignments could be processed by the same techniques as other read sources.

This could perhaps be exposed as a new --long-reads '*.fastq.gz' command-line option.
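To make the proposed flow concrete, here is a rough sketch of the two steps such a module might run for each long-read file. It only builds the command lines and does not execute them. The function name, the library labels, and the output layout are hypothetical; the minimap2 presets are the ones suggested in the minimap2 README, and the TranscriptClean arguments follow its README. Passing `--MD` to minimap2 assumes TranscriptClean wants the MD tag present in its input, and which preset suits a given library is something to verify.

```python
from pathlib import Path

# Presets as suggested in the minimap2 README for spliced long-read alignment.
# The dictionary keys are hypothetical labels, not nf-core parameters.
MINIMAP2_PRESETS = {
    "ont_cdna": ["-ax", "splice"],
    "ont_direct_rna": ["-ax", "splice", "-uf", "-k14"],
    "pacbio_isoseq": ["-ax", "splice:hq", "-uf"],
}


def sample_name(reads):
    """Strip common FASTQ suffixes from the file name to get a sample label."""
    name = Path(reads).name
    for suffix in (".gz", ".fastq", ".fq"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
    return name


def build_long_read_commands(reads, genome_fasta, outdir, library="ont_cdna", threads=4):
    """Return [align_cmd, clean_cmd] as argument lists for one long-read file."""
    prefix = Path(outdir) / sample_name(reads)
    sam = str(prefix) + ".sam"

    # Step 1: spliced alignment of the long reads against the reference genome.
    # --MD is an assumption about what the downstream step needs.
    align_cmd = [
        "minimap2", *MINIMAP2_PRESETS[library],
        "--MD", "-t", str(threads),
        "-o", sam,
        str(genome_fasta), str(reads),
    ]

    # Step 2: correct the alignments with TranscriptClean.
    clean_cmd = [
        "python", "TranscriptClean.py",
        "--sam", sam,
        "--genome", str(genome_fasta),
        "--outprefix", str(prefix),
    ]
    return [align_cmd, clean_cmd]


if __name__ == "__main__":
    for cmd in build_long_read_commands("reads/sampleA.fastq.gz", "genome.fa", "results"):
        print(" ".join(cmd))
```

In a pipeline, each file matched by the --long-reads pattern would go through these two steps, and the corrected alignments would then join the existing downstream processing.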

nf-core / rnaseq

Support for long-reads (e.g. MinION / PacBio) with minimap2 #380
