biod / sambamba

Tools for working with SAM/BAM data
http://thebird.nl/blog/D_Dragon.html
GNU General Public License v2.0

markduplicate's result still differs from picard #461

Closed xiucz closed 3 years ago

xiucz commented 3 years ago

Hi, I have seen the earlier issues about sambamba markdup, which report that it gives the same results as picard. In my case, though, when I check the output with samtools flagstat, the two tools give different duplicate counts:

#version 0.8.0
#sample1 using sambamba
112077088 + 0 in total (QC-passed reads + QC-failed reads)
70920 + 0 secondary
0 + 0 supplementary
29370719 + 0 duplicates
112048325 + 0 mapped (99.97% : N/A)
112006168 + 0 paired in sequencing
56003084 + 0 read1
56003084 + 0 read2
111455934 + 0 properly paired (99.51% : N/A)
111962358 + 0 with itself and mate mapped
15047 + 0 singletons (0.01% : N/A)
465646 + 0 with mate mapped to a different chr
418127 + 0 with mate mapped to a different chr (mapQ>=5)

#sample1 using picard
112077088 + 0 in total (QC-passed reads + QC-failed reads)
70920 + 0 secondary
0 + 0 supplementary
29393253 + 0 duplicates
112048325 + 0 mapped (99.97% : N/A)
112006168 + 0 paired in sequencing
56003084 + 0 read1
56003084 + 0 read2
111455934 + 0 properly paired (99.51% : N/A)
111962358 + 0 with itself and mate mapped
15047 + 0 singletons (0.01% : N/A)
465646 + 0 with mate mapped to a different chr
418127 + 0 with mate mapped to a different chr (mapQ>=5)

#sample2 using sambamba
84808248 + 0 in total (QC-passed reads + QC-failed reads)
48994 + 0 secondary
0 + 0 supplementary
21292762 + 0 duplicates
84784648 + 0 mapped (99.97% : N/A)
84759254 + 0 paired in sequencing
42379627 + 0 read1
42379627 + 0 read2
84349076 + 0 properly paired (99.52% : N/A)
84723392 + 0 with itself and mate mapped
12262 + 0 singletons (0.01% : N/A)
346756 + 0 with mate mapped to a different chr
312073 + 0 with mate mapped to a different chr (mapQ>=5)

#sample2 using picard
84808248 + 0 in total (QC-passed reads + QC-failed reads)
48994 + 0 secondary
0 + 0 supplementary
21308634 + 0 duplicates
84784648 + 0 mapped (99.97% : N/A)
84759254 + 0 paired in sequencing
42379627 + 0 read1
42379627 + 0 read2
84349076 + 0 properly paired (99.52% : N/A)
84723392 + 0 with itself and mate mapped
12262 + 0 singletons (0.01% : N/A)
346756 + 0 with mate mapped to a different chr
312073 + 0 with mate mapped to a different chr (mapQ>=5)

Issue 148 reports that sambamba gives the same result as picard, but in practice it doesn't. Is it a bug?

Thank you. xiucz.

pjotrp commented 3 years ago

Well, tools change. Picard has probably changed over the last few years. The difference is small and I would not worry about it. There are always edge cases.
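
To put a number on "small", here is a quick sketch that uses only the totals and duplicate counts posted in the flagstat output above. The helper name `duplicate_gap` is made up for illustration and is not part of sambamba, picard, or samtools.

```python
# (total reads, sambamba duplicates, picard duplicates),
# copied from the flagstat output posted in this issue.
samples = {
    "sample1": (112077088, 29370719, 29393253),
    "sample2": (84808248, 21292762, 21308634),
}


def duplicate_gap(total, sambamba_dups, picard_dups):
    """Return (absolute gap, gap as a percentage of all reads)."""
    diff = picard_dups - sambamba_dups
    return diff, 100.0 * diff / total


for name, (total, sambamba_dups, picard_dups) in samples.items():
    diff, pct = duplicate_gap(total, sambamba_dups, picard_dups)
    print(f"{name}: picard marks {diff} more duplicates ({pct:.3f}% of all reads)")
```

For both samples the gap is roughly 0.02% of all reads, and every other flagstat category is identical between the two tools. This only quantifies the gap; it does not show which reads are flagged differently or why.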