CGATOxford / UMI-tools

Tools for handling Unique Molecular Identifiers in NGS data sets
MIT License

Add FAQ entry re identification of possible duplicate reads/pairs #631

Closed: TomSmithCGAT closed this 6 months ago

TomSmithCGAT commented 6 months ago

I've added two entries to the FAQ to explain how possible duplicate reads are identified and why the output of dedup can appear to contain reads with the same coordinates. I don't find this easy to explain clearly, so I'm very happy to take suggested edits!

See #555 for an example motivation, though I feel like this has come up a few times in issues.

IanSudbery commented 6 months ago

@TomSmithCGAT Had a bit of a tweak (might have gotten carried away)

TomSmithCGAT commented 6 months ago

Not got carried away at all - I think this is perfect! I had something very similar in mind, but was too lazy to do it!

I'm happy if we merge this now.