Open jeffbhasin opened 6 years ago
Hello Jeff,
the difference between these two tools is purely technical; the output should be equivalent. bammarkduplicates2 was designed to work more in memory than on disk.
Best, German
Hello German, Thank you for the information. I ran bammarkduplicates2 and Picard Tools MarkDuplicates on some of the same RNA-seq samples, and the two programs did not produce the same duplicate calls. Is this expected? Is there a difference in how biobambam and Picard call duplicates?
Thanks, Jeff
Hello Jeff,
both bammarkduplicates2 and Picard's MarkDuplicates mark duplicates by finding read pairs that map to the reference in the same way. Within each set of pairs mapping in the same way, the one pair left unmarked is selected using a score computed from the base qualities of the reads involved. This score can be identical for several pairs, in which case any of them could be the "best" one. This leaves room for a divergence between the two tools, so different output is possible in this case. Different output in other cases may be a bug, so if you encounter it, please report it.
Best, German
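To make the tie case concrete, here is a minimal Python sketch. It is not the code of either tool: the scoring rule (sum of base qualities at or above a threshold), the mapping key, and the tie-breaking rules are all assumptions chosen for illustration, and the names `pair_score` and `mark_duplicates` are hypothetical. It only shows how two implementations that agree on the score can still mark different pairs when scores tie.

```python
from collections import defaultdict


def pair_score(quals_1, quals_2, min_qual=15):
    # Assumed scoring rule for this sketch: sum of the base qualities of
    # both mates, counting only bases with quality >= min_qual.
    return sum(q for q in quals_1 + quals_2 if q >= min_qual)


def mark_duplicates(pairs, keep_first_on_tie=True):
    """Return the names of pairs marked as duplicates.

    pairs: list of (name, mapping_key, quals_1, quals_2), names unique.
    mapping_key stands in for "maps to the reference in the same way".
    keep_first_on_tie is a hypothetical tie-breaking switch.
    """
    groups = defaultdict(list)
    for name, key, quals_1, quals_2 in pairs:
        groups[key].append((name, pair_score(quals_1, quals_2)))

    duplicates = set()
    for members in groups.values():
        best = max(score for _, score in members)
        tied = [name for name, score in members if score == best]
        # Any of the tied pairs is a valid "best" pair; the choice is arbitrary.
        keep = tied[0] if keep_first_on_tie else tied[-1]
        duplicates.update(name for name, _ in members if name != keep)
    return duplicates


pairs = [
    ("pairA", ("chr1", 100, "+", 350, "-"), [30, 30, 30], [30, 30, 30]),
    ("pairB", ("chr1", 100, "+", 350, "-"), [30, 30, 30], [30, 30, 30]),
    ("pairC", ("chr1", 100, "+", 350, "-"), [30, 10, 30], [30, 30, 30]),
    ("pairD", ("chr2", 500, "+", 800, "-"), [20, 20, 20], [20, 20, 20]),
]

tool_1 = mark_duplicates(pairs, keep_first_on_tie=True)
tool_2 = mark_duplicates(pairs, keep_first_on_tie=False)
print(sorted(tool_1))  # ['pairB', 'pairC']
print(sorted(tool_2))  # ['pairA', 'pairC']
```

Both runs mark the same number of duplicates and agree on pairC and pairD; they differ only in which of the tied pairs, pairA or pairB, is kept. When comparing real output from the two tools, checking whether the per-group duplicate counts match is one way to separate this harmless divergence from a possible bug.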
Hello German, I was looking for documentation about the difference between bammarkduplicates and bammarkduplicates2 and did not see any on the help pages for those respective tools. Is there some description of this difference?
Kind regards, Jeff