Closed by gdolsten 1 year ago
I am not sure it has ever been published, but one can observe a slight enrichment of pairs within ~3 bp of each other, suggesting that some steps of the library prep might add or remove 1-3 bp at the ends of fragments. I'm not sure how widespread it is or how it varies between protocols. That said, it wasn't a huge effect, and the vast majority of duplicates are exact matches, so it's fine to set it to 0.
OK, great, thanks. Pairtools keeps one representative from each group of duplicates though, correct?
Of course. It keeps the first one it encounters.
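To make the behaviour discussed here concrete, here is a rough sketch of position-tolerant deduplication that keeps the first pair it encounters. This is not pairtools' actual code: the function name `dedup_pairs`, the tuple layout, and the quadratic comparison against already-kept pairs are all assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical record layout: (chrom1, pos1, chrom2, pos2, strand1, strand2)
Pair = Tuple[str, int, str, int, str, str]


def dedup_pairs(pairs: List[Pair], max_mismatch: int = 3) -> List[Pair]:
    """Keep the first pair seen; drop later pairs whose two sides both map
    within max_mismatch bp of an already-kept pair on the same chromosomes
    and strands. Illustrative O(n^2) sketch, not the pairtools algorithm."""
    kept: List[Pair] = []
    for p in pairs:
        is_dup = any(
            p[0] == k[0] and p[2] == k[2]          # same chromosomes
            and p[4] == k[4] and p[5] == k[5]      # same strands
            and abs(p[1] - k[1]) <= max_mismatch   # side 1 close enough
            and abs(p[3] - k[3]) <= max_mismatch   # side 2 close enough
            for k in kept
        )
        if not is_dup:
            kept.append(p)
    return kept


example = [
    ("chr1", 100, "chr1", 5000, "+", "-"),
    ("chr1", 102, "chr1", 5001, "+", "-"),  # within 3 bp on both sides
    ("chr1", 100, "chr1", 5000, "+", "-"),  # exact copy of the first pair
    ("chr1", 100, "chr1", 5010, "+", "-"),  # side 2 is 10 bp away
]
with_tolerance = dedup_pairs(example, max_mismatch=3)
exact_only = dedup_pairs(example, max_mismatch=0)
```

With a tolerance of 3 the second and third pairs are dropped in favour of the first; with a tolerance of 0 only the exact copy is dropped.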
I assume this is solved now; feel free to reopen!
Is there a reason why the default parameters are:
Pairs with both sides mapped within this distance (bp) from each other are considered duplicates. [dedup option] [default: 3]?
In particular, is there a reason why deduplication doesn't remove perfectly identical read pairs?