Closed jgolob closed 4 years ago
Hi @jgolob! We have a pipeline in development specifically for Nanopore long reads that uses minimap2: https://github.com/nf-core/nanoseq
In any case, it probably doesn't make sense to add this functionality here: the pipeline is already quite complex, and the mapping, QC and downstream processing of long-read data tend to be quite different.
Anyway, please take a look and feel free to join the #nanoseq channel on the nf-core Slack workspace if you have any questions. https://nf-co.re/join
I'll close this in favour of opening issues on #nanoseq :+1:
Opportunity
Long-read sequencers have some potential advantages for RNAseq over the more typical Illumina short reads. These include:
The current nf-core rnaseq pipeline cannot handle long reads.
Resources
There is an evolving set of tools capable of handling the unique challenges of long reads:
1) [minimap2](https://github.com/lh3/minimap2) to efficiently align the long reads against a reference genome
2) [TranscriptClean](https://github.com/dewyman/TranscriptClean) to filter and correct the alignments for common errors introduced by long-read sequencing technology
Suggestion
Incorporate a module into the nf-core/rnaseq pipeline for handling long reads sourced from cDNA / raw RNA via `minimap2` and `TranscriptClean`, after which the filtered alignments could be processed by the same techniques as other read sources. Perhaps as a new `--long-reads '*.fastq.gz'` command line option.
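
To make the suggested flow a bit more concrete, here is a rough Python sketch of the two steps as command lines. It only builds the argument lists and does not run anything. The helper names (`minimap2_cmd`, `transcriptclean_cmd`, `long_read_plan`) and the output naming are hypothetical. The `-ax splice` preset is minimap2's spliced-alignment mode; the TranscriptClean options (`--sam`, `--genome`, `--outprefix`) are what I understand its basic usage to be, so please check them against the tool's README before relying on this.

```python
from pathlib import Path


def minimap2_cmd(reference, reads, preset="splice"):
    """Spliced alignment of long reads; minimap2 writes SAM to stdout."""
    return ["minimap2", "-ax", preset, str(reference), str(reads)]


def transcriptclean_cmd(sam, reference, out_prefix, script="TranscriptClean.py"):
    """Correction step. Option names are assumed from the TranscriptClean README."""
    return [
        "python", script,
        "--sam", str(sam),
        "--genome", str(reference),
        "--outprefix", str(out_prefix),
    ]


def long_read_plan(reference, reads, out_dir):
    """Return the two steps in order; 'stdout' is where the output would be redirected."""
    sample = Path(reads).name.split(".")[0]  # "sample1.fastq.gz" -> "sample1"
    out_dir = Path(out_dir)
    sam = out_dir / f"{sample}.sam"
    prefix = out_dir / f"{sample}.clean"  # hypothetical naming scheme
    return [
        {"cmd": minimap2_cmd(reference, reads), "stdout": str(sam)},
        {"cmd": transcriptclean_cmd(sam, reference, prefix), "stdout": None},
    ]


if __name__ == "__main__":
    for step in long_read_plan("genome.fa", "sample1.fastq.gz", "out"):
        print(" ".join(step["cmd"]), "->", step["stdout"])
```

In a real pipeline each step would be its own Nextflow process; the point here is only the ordering, with the aligner's SAM feeding the correction step before the existing downstream tools take over.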