10XGenomics / cellranger

10x Genomics Single Cell Analysis
https://www.10xgenomics.com/support/software/cell-ranger

Intercepting non optical duplicate from BAM file #141

Closed castaway1990 closed 2 years ago

castaway1990 commented 2 years ago

Hello everyone,

I'm working with the aligned BAM files produced by cellranger count. I need to extract confidently mapped reads and exclude amplification products. In the *_possorted_bam.bam file I can see flag 1024, which according to https://broadinstitute.github.io/picard/explain-flags.html marks PCR or optical duplicates. Is that flag added by Cell Ranger itself, taking advantage of UMIs, or is it a standard flag produced by STAR? If that's not the case, how should I parse the BAM file for that information?

Thank you! Davide

evolvedmicrobe commented 2 years ago

If you want one read for each UMI that generated a count in the feature-barcode matrix (that is, a read that was not only confidently mapped, but confidently mapped to the transcriptome), I'd select reads with the 8 bit set in the xf aux tag described here.

Both cellranger count and cellranger atac set the duplicate flag themselves; it is not produced by STAR.
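
The selection described above could be sketched as follows in Python. This is a minimal illustration, not Cell Ranger's own tooling: it assumes the BAM has been converted to SAM text (for example with samtools view), that xf is stored as an integer tag (xf:i:N), and that bit 8 has the meaning given in this comment. The function names are hypothetical.

```python
UMI_REPRESENTATIVE = 8   # xf bit named in the comment above (assumed meaning)
DUPLICATE_FLAG = 0x400   # SAM FLAG 1024: PCR or optical duplicate


def parse_xf(sam_line):
    """Return the integer value of the xf:i: tag, or None if the tag is absent."""
    fields = sam_line.rstrip("\n").split("\t")
    for field in fields[11:]:  # optional tags start at column 12
        if field.startswith("xf:i:"):
            return int(field[5:])
    return None


def is_umi_representative(sam_line):
    """True if the record's xf tag has the 8 bit set."""
    xf = parse_xf(sam_line)
    return xf is not None and bool(xf & UMI_REPRESENTATIVE)


def is_flagged_duplicate(sam_line):
    """True if the record's FLAG field has the 1024 (duplicate) bit set."""
    flag = int(sam_line.split("\t")[1])
    return bool(flag & DUPLICATE_FLAG)


def select_umi_reads(sam_lines):
    """Yield alignment records (header lines skipped) whose xf tag has bit 8 set."""
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line
            continue
        if is_umi_representative(line):
            yield line
```

To use it on a real file, one option is to pipe the SAM text into a script that passes sys.stdin to select_umi_reads and prints what it yields.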

castaway1990 commented 2 years ago

Perfect. Apparently all reads in the B-cell lymphoma multiome dataset (from the 10x website) report xf:0 in that field. I will investigate a bit more; meanwhile, relying on the 1024 flag will do the trick.
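
One way to check an observation like this is to tally the xf values and the duplicate flag across the file. The Python below is only a sketch: it assumes SAM text input (for example from samtools view) and an integer xf tag (xf:i:N), and the function name is hypothetical.

```python
from collections import Counter


def tally_xf_and_duplicates(sam_lines):
    """Count records per xf value (None if the tag is missing) and count
    records whose FLAG has the 1024 (duplicate) bit set."""
    xf_counts = Counter()
    n_duplicates = 0
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & 0x400:  # 0x400 == 1024
            n_duplicates += 1
        xf = None
        for field in fields[11:]:  # optional tags start at column 12
            if field.startswith("xf:i:"):
                xf = int(field[5:])
                break
        xf_counts[xf] += 1
    return xf_counts, n_duplicates
```

If every record lands under the key 0, the xf tag carries no per-read information in that file, and the 1024 flag is the remaining signal.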

Thank you! Davide