Closed ionox0 closed 6 years ago
Duplicate reads are not eligible for assembly and do not contribute to contig generation. However, they are realigned when appropriate.
Thank you very much for the information 👍
By "when appropriate", that means when the duplicate reads map more closely to the new reference than to the old reference, correct?
We are exploring removing the MarkDuplicates step from our pipeline, and were wondering whether this would cause issues with the runtime or results of Abra. We are running a comparison and can get back to you with the results.
"By "when appropriate", that means when the duplicate reads map more closely to the new reference than to the old reference, correct?"
Yes, that's correct. After assembly, the duplicate reads are realigned the same way as the non-duplicates.
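For anyone who wants to check how duplicates are marked in their own data, here is a minimal Python sketch of the behaviour described above. It is only an illustration, not ABRA's actual code: it assumes duplicates are identified by the standard SAM FLAG bit 0x400 (the bit MarkDuplicates sets), and the helper name `split_for_assembly` is hypothetical.

```python
DUPLICATE_FLAG = 0x400  # SAM FLAG bit: PCR or optical duplicate


def is_duplicate(sam_line: str) -> bool:
    """Return True if the SAM record has the duplicate bit set."""
    flag = int(sam_line.split("\t")[1])  # FLAG is the second SAM column
    return bool(flag & DUPLICATE_FLAG)


def split_for_assembly(sam_lines):
    """Hypothetical helper mirroring the description in this thread.

    Returns (assembly_reads, realign_reads): duplicates are left out of
    assembly but remain candidates for realignment.
    """
    # Skip empty lines and SAM header lines (which start with "@").
    records = [line for line in sam_lines if line and not line.startswith("@")]
    assembly_reads = [r for r in records if not is_duplicate(r)]
    realign_reads = list(records)  # duplicates included
    return assembly_reads, realign_reads
```

Passing it the text lines of a SAM file (for example, the output of `samtools view -h`) gives the two groups; a read with FLAG 1123 (99 + 1024) would be skipped for assembly but kept for realignment.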
Will be very interested in seeing your results, thanks!
Just to circle back on this: we ran our pipeline with and without the MarkDuplicates step and noticed a slight change in alt allele frequency. Because we are working with UMIs, though, there are some complicating factors that may have contributed to this discrepancy. For now we'll be leaving the MarkDuplicates step in.
I just had a question regarding release 0.92.
Will Abra realign duplicates or ignore these reads? Thank you for the help.