mgatk consensus read formation

JohnApp-scRNA commented 3 years ago

The wiki page https://github.com/caleblareau/mgatk/wiki/Process-mtDNA-from-CellRanger-ATAC states that "for scRNA, we can utilize UMI-aware PCR deduplication with mgatk". I was wondering how exactly this is done. In particular, how do you form a consensus/representative read for each UMI when you perform deduplication? I've seen a variety of different methods for this in the literature.

Thanks, John

caleblareau commented 3 years ago

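For the current implementation, we use Picard for deduplication, which keeps the read with the highest base-quality score in each duplicate set (by default Picard ranks reads by the sum of their base qualities), so it’s a representative read, not a consensus.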

https://gatk.broadinstitute.org/hc/en-us/articles/360037225972-MarkDuplicates-Picard-
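To make the "representative read" idea concrete, here is a minimal pure-Python sketch. It is not mgatk's or Picard's code. The `Read` record, the grouping key (cell barcode, UMI, alignment start), and the function names are all hypothetical, and the score (sum of base qualities of at least 15) is an assumption meant to mirror Picard's default `SUM_OF_BASE_QUALITIES` strategy.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    """Hypothetical minimal read record, used for illustration only."""
    name: str
    cell_barcode: str
    umi: str
    start: int
    seq: str
    quals: tuple  # Phred base qualities, one per base


def quality_score(read, min_q=15):
    # Assumption: sum the base qualities, ignoring bases below min_q.
    return sum(q for q in read.quals if q >= min_q)


def pick_representatives(reads):
    """Keep one read per (cell barcode, UMI, start) group: the highest-scoring one."""
    groups = defaultdict(list)
    for read in reads:
        groups[(read.cell_barcode, read.umi, read.start)].append(read)
    # max() returns the first read with the top score, so ties go to input order.
    return {key: max(members, key=quality_score) for key, members in groups.items()}
```

The remaining reads in each group would be treated as PCR duplicates. No bases are merged across reads, so an error in the retained read is carried forward unchanged.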

A consensus read would be optimal, and I’ve toyed around with this. Unfortunately, I couldn’t get an implementation going with any reasonable efficiency.
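For contrast, one common way to build a consensus read is a per-position, quality-weighted majority vote across the reads in a UMI group. The sketch below is only an illustration of that general idea, not something mgatk implements. The function name `consensus_read` is hypothetical, and it assumes all reads in the group have the same length and alignment start, with no indels.

```python
from collections import defaultdict


def consensus_read(seqs, quals):
    """Quality-weighted majority vote per position.

    seqs:  equal-length base strings from one UMI group (assumed pre-aligned).
    quals: matching sequences of Phred qualities, one per base.
    """
    if not seqs:
        raise ValueError("need at least one read")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs) or any(len(q) != length for q in quals):
        raise ValueError("all reads and quality vectors must share one length")

    consensus = []
    for i in range(length):
        weights = defaultdict(int)
        for seq, qual in zip(seqs, quals):
            weights[seq[i]] += qual[i]  # each read votes with its base quality
        # Iterating the sorted bases makes ties resolve alphabetically.
        consensus.append(max(sorted(weights), key=weights.get))
    return "".join(consensus)
```

This pure-Python loop touches every base of every read, which gives a feel for why an efficient implementation over a full single-cell BAM is harder than picking one read per group.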

(Sent by email in reply to JohnApp-scRNA's message of Sep 17, 2020, 7:28 AM.)

JohnApp-scRNA commented 3 years ago

Thanks for clarifying :)

caleblareau/mgatk, issue #27