Closed BrettLiddell closed 1 month ago
Hi @BrettLiddell
I added a new command-line option that implements your feature request. With the --dedup-barcode-begin-char option, you can specify the character that signals the beginning of the UMI sequence in read names. For example, in your case, " --dedup-barcode-begin-char + " should do the job.
Please be aware that, by default, "+" is also the character that separates the two parts of a duplex UMI, so --dedup-barcode-duplex-sep-char should be set accordingly if duplex UMIs are used.
Thank you for implementing that as a feature! I'll try it out soon.
Alright, if you have any other questions, please let me know.
Hi @BrettLiddell
Do you have any other questions about this issue? If not, I will close it soon.
This issue is closed because there has not been any recent activity on it.
Hello,
To get my reads into the originalName#UMI format in a BAM file, I am running:
1) picard FastqToSam
2) fgbio ExtractUMIsFromBam (to get reads in originalName#UMI format)
3) picard SamToFastq
4) bwa (for alignment)
However, when using fgbio ExtractUMIsFromBam with --annotate-read-names set to true, the UMI is appended to the QNAME with a + instead of a #. Although I can reformat the BAMs, I was wondering whether there are any alternative pre-processing steps or tools that should be used before running UVC.
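For reference, the BAM reformatting mentioned above could look roughly like the sketch below. It is only an illustration, not part of UVC or fgbio: the function names are made up, and it assumes the UMI is the last "+"-delimited part of the QNAME and that the UMI itself contains no "+". It works on SAM text lines, so it does not need any BAM library.

```python
def rewrite_qname(qname, old="+", new="#"):
    """Replace the last occurrence of `old` in a read name with `new`.

    Assumption: the UMI follows the last `old` character in the name.
    Names without `old` are returned unchanged.
    """
    head, sep, umi = qname.rpartition(old)
    if not sep:
        return qname
    return head + new + umi


def rewrite_sam_line(line, old="+", new="#"):
    """Rewrite the QNAME (first tab-separated field) of one SAM line.

    Header lines (starting with "@") are passed through untouched.
    """
    if line.startswith("@"):
        return line
    fields = line.split("\t", 1)
    fields[0] = rewrite_qname(fields[0], old, new)
    return "\t".join(fields)


# Small in-memory demo with a made-up header line and a made-up read.
demo = [
    "@HD\tVN:1.6\tSO:unsorted\n",
    "read1+ACGTACGT\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tFFFF\n",
]
for sam_line in demo:
    print(rewrite_sam_line(sam_line), end="")
```

To apply this to a real file, the same function could be run over the output of samtools view -h and the result piped back into samtools view -b.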