PacificBiosciences / pb-CpG-tools

Collection of tools for the analysis of CpG data
BSD 3-Clause Clear License

Is deduplication needed for an aligned BAM? #19

Closed hmyh1202 closed 2 years ago

hmyh1202 commented 2 years ago

Hello:

I have HiFi reads that were processed with primrose and aligned to the reference. Should I deduplicate the aligned BAM? I found only a very small number of duplicate reads when I ran sambamba. Thank you!

Best regards!

dportik commented 2 years ago

Hello @hmyh1202, with HiFi sequencing it is nearly impossible to obtain duplicate reads: there is no PCR step, so each template is unique. The duplicates you are seeing are likely related to mapping and to the duplicate detection tool. We recommend using pbmarkdup to find actual duplicates in HiFi datasets, and these should only occur with amplified libraries. You could re-run with this tool to check your results, but my guess is that you can safely ignore the "duplicates" detected by sambamba.
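
To put a number on "very few" before deciding whether to ignore the flagged reads, one option is to count how many primary alignments carry the SAM duplicate flag (0x400), which duplicate-marking tools set. The following is a minimal sketch, not part of pb-CpG-tools or pbmarkdup: the function name `duplicate_fraction` and the example records are hypothetical, and it assumes the alignments are supplied as SAM text lines (for example, from `samtools view -h`).

```python
# SAM flag bits as defined in the SAM specification.
SECONDARY = 0x100       # secondary alignment
DUPLICATE = 0x400       # PCR or optical duplicate
SUPPLEMENTARY = 0x800   # supplementary alignment


def duplicate_fraction(sam_lines):
    """Count duplicate-flagged primary alignments in SAM text.

    Returns (duplicates, primary_alignments). Header lines, blank lines,
    and secondary/supplementary records are skipped so that each read is
    counted at most once.
    """
    duplicates = 0
    primary = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        primary += 1
        if flag & DUPLICATE:
            duplicates += 1
    return duplicates, primary


# Illustrative records only (made-up reads, not real data).
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "read2\t1024\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "read3\t2048\tchr1\t500\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "read4\t16\tchr2\t200\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
]
dups, total = duplicate_fraction(example)
print(f"{dups} of {total} primary alignments are flagged as duplicates")
```

A tiny fraction of flagged reads in an unamplified HiFi library is consistent with the explanation above; a large fraction would be a reason to check the library with pbmarkdup.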