nf-core / sarek

Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
https://nf-co.re/sarek
MIT License

Support for UMI FASTQ #871

Open adamrtalbot opened 1 year ago

adamrtalbot commented 1 year ago

Description of feature

UMI data is sometimes stored in a third FASTQ file, typically because the UMI is embedded in the index and bcl2fastq2 cannot merge it into the forward or reverse read. This produces three files:

- R1: Forward
- R2: UMI
- R3: Reverse

We can support UMI processing using this method. Key points:

See https://github.com/nf-core/fastquorum/pull/11 for an example implementation. This could be implemented as part of an nf-core subworkflow.
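To make the three-file layout concrete, here is a minimal Python sketch of one way the UMI file could be folded into the forward and reverse reads. It is only an illustration and not the approach taken in the linked PR. The function names (`read_fastq`, `tag_with_umi`, `add_umi`) are hypothetical, and appending the UMI to the read name with an underscore is an arbitrary convention chosen for the example. It assumes the three files list the same reads in the same order, with identical read names up to the first whitespace.

```python
import io
from itertools import islice, zip_longest
from typing import Iterator, TextIO, Tuple

# One FASTQ record: header, sequence, separator line, quality string.
Record = Tuple[str, str, str, str]


def read_fastq(handle: TextIO) -> Iterator[Record]:
    """Yield 4-line FASTQ records from an open text handle."""
    while True:
        lines = [line.rstrip("\n") for line in islice(handle, 4)]
        if not lines:
            return
        if len(lines) < 4:
            raise ValueError("truncated FASTQ record")
        yield (lines[0], lines[1], lines[2], lines[3])


def read_id(header: str) -> str:
    """Return the read name: the header up to the first whitespace, without '@'."""
    return header[1:].split()[0]


def add_umi(record: Record, umi_seq: str) -> Record:
    """Append the UMI to the read name (assumed '_' separator), keeping any comment."""
    header, seq, sep, qual = record
    name, _, comment = header.partition(" ")
    new_header = f"{name}_{umi_seq}" + (f" {comment}" if comment else "")
    return (new_header, seq, sep, qual)


def tag_with_umi(
    fwd: TextIO, umi: TextIO, rev: TextIO
) -> Iterator[Tuple[Record, Record]]:
    """Walk R1 (forward), R2 (UMI) and R3 (reverse) in lockstep.

    Yields (forward, reverse) record pairs whose read names carry the UMI
    sequence taken from the matching R2 record.
    """
    triples = zip_longest(read_fastq(fwd), read_fastq(umi), read_fastq(rev))
    for f, u, r in triples:
        # zip_longest pads with None when one file runs out early.
        if f is None or u is None or r is None:
            raise ValueError("FASTQ files have different numbers of records")
        ids = {read_id(f[0]), read_id(u[0]), read_id(r[0])}
        if len(ids) != 1:
            raise ValueError(f"read names out of sync: {sorted(ids)}")
        yield add_umi(f, u[1]), add_umi(r, u[1])
```

In a real pipeline the handles would come from `gzip.open(path, "rt")` on the R1, R2 and R3 files, and the tagged pairs would be written back out as two FASTQ files for the downstream UMI-aware steps.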

FriederikeHanssen commented 1 year ago

Nice, yes, I think this is already planned for as soon as we have the shared subworkflow. (I think there is a PR in the modules repo somewhere; we should cross-check that it includes this.)