Option to keep chimeric reads after UMI deduplication

While reviewing #1369, I noticed that we set the parameter --chimeric-pairs=discard for UMI-tools and wondered whether that is actually a good default. I had planned to discuss it briefly in the #rnaseq_dev Slack channel, but since it is now an official issue, we can track it here as well :-)

From a purely biological point of view, the transcriptome alignments in particular may contain a significant number of chimeric read pairs, for example because of an unannotated splice variant or an antisense long non-coding RNA spanning several annotated transcripts. In addition, many users run the pipeline on cancer data, where fusion genes and chromosomal rearrangements are to be expected.
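To get a feeling for how many pairs would be affected in a given sample, one could count them before deduplication. The following is only a minimal sketch: it assumes that "chimeric" means the two mates are aligned to different reference sequences (transcript IDs in a transcriptome BAM), which is my understanding of how UMI-tools uses the term, and it works on plain SAM text lines (e.g. from `samtools view`). The function names `is_chimeric_pair` and `chimeric_fraction` are made up for illustration and are not part of UMI-tools or the pipeline.

```python
def is_chimeric_pair(sam_line: str) -> bool:
    """Return True if a SAM record belongs to a pair whose mates map to
    different reference sequences (assumed definition of 'chimeric')."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    rname, rnext = fields[2], fields[6]

    paired = bool(flag & 0x1)         # read is part of a pair
    read_unmapped = bool(flag & 0x4)  # this read is unmapped
    mate_unmapped = bool(flag & 0x8)  # its mate is unmapped
    if not paired or read_unmapped or mate_unmapped:
        return False

    # RNEXT is "=" when the mate is on the same reference, "*" when unknown.
    return rnext not in ("=", "*") and rnext != rname


def chimeric_fraction(sam_lines) -> float:
    """Fraction of read pairs that are chimeric, counting each pair once
    via its first-in-pair record (FLAG bit 0x40)."""
    total = 0
    chimeric = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])
        if not (flag & 0x1 and flag & 0x40):
            continue  # only look at read 1 of each pair
        total += 1
        if is_chimeric_pair(line):
            chimeric += 1
    return chimeric / total if total else 0.0
```

Fed with the records of a transcriptome alignment, `chimeric_fraction` gives a rough estimate of the share of pairs that the discard setting would drop, under the assumption stated above.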

However, I have since read in the UMI-tools FAQ that disabling the option significantly increases memory demands. The computational cost therefore clearly argues for ignoring chimeric pairs by default and leaving it to users of the pipeline to examine chimeric transcripts specifically if they are of interest.

nf-core / rnaseq

Option to keep chimeric reads after UMI deduplication #1373

Description of feature