genome / docker-dna-alignment

A fat docker image for running alignment

Add UMI option to markduplicates_helper.sh #5

Open jasonwalker80 opened 7 years ago

jasonwalker80 commented 7 years ago

Ideally this would be an optional input to markduplicates_helper.sh, left unset by default. For UMI data, the tag is used to mark duplicates within a single molecule family. Example: BARCODE_TAG=RX

ebelter commented 7 years ago

10X uses the RX tag as the raw index, and the BX tag as the barcode that is 'error-corrected and confirmed against a list of known-good barcode sequences. Use this for analysis.'

jasonwalker80 commented 7 years ago

Sure, the tag is configurable. I was using the tag from the UMI data I was working on recently. Since this is to process data outside of the longranger/10X software, you can use any tag you'd like. It should be a configurable input.
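A minimal sketch of how the optional tag might be wired in, assuming the helper builds Picard MarkDuplicates arguments in the legacy KEY=VALUE form. The function name `build_markdup_args` and its positional parameters are hypothetical and are not taken from the actual markduplicates_helper.sh; only the INPUT, OUTPUT, METRICS_FILE, and BARCODE_TAG argument names come from Picard.

```shell
# Hypothetical helper: print one MarkDuplicates argument per line.
# Usage: build_markdup_args INPUT_BAM OUTPUT_BAM METRICS_FILE [BARCODE_TAG]
build_markdup_args() {
    md_input="$1"
    md_output="$2"
    md_metrics="$3"
    # The fourth parameter is optional; default to empty (tag left unset).
    md_barcode_tag="${4:-}"

    printf 'INPUT=%s\n' "$md_input"
    printf 'OUTPUT=%s\n' "$md_output"
    printf 'METRICS_FILE=%s\n' "$md_metrics"

    # Only emit BARCODE_TAG when the caller supplied a tag (e.g. RX or BX).
    if [ -n "$md_barcode_tag" ]; then
        printf 'BARCODE_TAG=%s\n' "$md_barcode_tag"
    fi
}
```

Calling `build_markdup_args in.bam out.bam metrics.txt` leaves the tag out entirely, while adding a fourth argument such as `RX` or `BX` appends the corresponding `BARCODE_TAG=` line, so the choice of tag stays with the caller.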

There has been much consternation on this issue: https://github.com/samtools/hts-specs/pull/200 https://github.com/samtools/hts-specs/pull/238