Closed FelixMoelder closed 1 year ago
We should discuss how this is joined with the separate umi fastq case. How is that configured now?
This still works. If there is a separate fastq file containing the UMIs, one can simply specify that file and set the read structure to +M, which defines the whole sequence of each fastq record as the UMI.
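To illustrate why +M covers the separate-file case, here is a minimal sketch of how a read structure can be applied to a single sequence. It is not the code fgbio or this workflow uses. The function names `parse_read_structure` and `extract_umi` are made up for this example, and only the operators T, B, M and S are handled.

```python
import re

# One segment is a length (digits, or "+" for "all remaining bases")
# followed by an operator letter. Only T, B, M and S are handled here.
_SEGMENT = re.compile(r"(\d+|\+)([TBMS])")


def parse_read_structure(structure):
    """Split a read structure such as '8M+T' into (length, operator) pairs."""
    segments = []
    pos = 0
    while pos < len(structure):
        match = _SEGMENT.match(structure, pos)
        if match is None:
            raise ValueError(f"invalid read structure: {structure!r}")
        length, operator = match.groups()
        # A length of None stands for '+', i.e. the rest of the read.
        segments.append((None if length == "+" else int(length), operator))
        pos = match.end()
    if not segments:
        raise ValueError("empty read structure")
    if any(length is None for length, _ in segments[:-1]):
        raise ValueError("'+' is only allowed in the last segment")
    return segments


def extract_umi(sequence, structure):
    """Return the list of subsequences covered by M (molecular barcode) segments."""
    umis = []
    offset = 0
    for length, operator in parse_read_structure(structure):
        end = len(sequence) if length is None else offset + length
        if operator == "M":
            umis.append(sequence[offset:end])
        offset = end
    return umis


# Separate UMI fastq: the whole record is the UMI.
print(extract_umi("ACGTACGT", "+M"))
# UMI embedded in the read: the first 8 bases are the UMI, the rest is template.
print(extract_umi("ACGTACGTTTGGCCAA", "8M+T"))
```

With +M the single segment spans the entire record, so a dedicated UMI fastq needs no special handling beyond pointing at that file.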
Can you update config/README.md to describe all ways to configure UMIs please?
Until now, UMIs were only supported by adding a fastq file containing the UMI of each read. Often, UMIs do not exist as separate fastq records but as part of the read sequences. To handle UMIs properly, information about them is now stored in two additional columns in the samplesheet.
umi_read
: Defines whether UMIs are part of records in fq1 or fq2.

umi_read_structure
: The template of the read defining the position of UMIs in records (see https://github.com/fulcrumgenomics/fgbio/wiki/Read-Structures).

Handling UMIs is optional. In case the umi_read column is missing or left empty, UMIs will not be annotated for duplicate marking or consensus read calculation.
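As a rough illustration of the optional handling, the sketch below reads a tab-separated sample sheet and decides per row whether UMI annotation applies. It is not the workflow's actual code. The sheet contents, the `sample_name` column, the value `fq1` for umi_read, and the helper `umi_settings` are assumptions made for this example. Only the column names fq1, fq2, umi_read and umi_read_structure come from the description above.

```python
import csv
import io

# Hypothetical sample sheet. Sample A carries its UMI in the first 8 bases
# of fq1; sample B leaves both UMI columns empty.
SHEET = (
    "sample_name\tfq1\tfq2\tumi_read\tumi_read_structure\n"
    "A\ta_R1.fq.gz\ta_R2.fq.gz\tfq1\t8M+T\n"
    "B\tb_R1.fq.gz\tb_R2.fq.gz\t\t\n"
)


def umi_settings(row):
    """Return (umi_read, umi_read_structure), or None if UMIs are not configured.

    A missing umi_read column and an empty umi_read cell are treated the same.
    """
    umi_read = (row.get("umi_read") or "").strip()
    if not umi_read:
        return None
    return umi_read, (row.get("umi_read_structure") or "").strip()


for row in csv.DictReader(io.StringIO(SHEET), delimiter="\t"):
    print(row["sample_name"], umi_settings(row))
```

Rows for which `umi_settings` returns None would simply skip UMI annotation, which matches the behaviour described for a missing or empty umi_read column.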