mozack / abra

Assembly Based ReAligner
MIT License

Abra will realign duplicates? #39

Closed ionox0 closed 6 years ago

ionox0 commented 6 years ago

I just had a question regarding release 0.92.

Will Abra realign duplicates or ignore these reads? Thank you for the help.

mozack commented 6 years ago

Duplicate reads are not eligible for assembly and do not contribute to contig generation. However, they are realigned when appropriate.
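
To illustrate the distinction, here is a minimal Python sketch, not ABRA's actual code. It assumes duplicates are identified by the standard SAM duplicate flag bit (0x400), which is what MarkDuplicates sets. The function names and the example reads are hypothetical.

```python
# SAM flag bit meaning "PCR or optical duplicate" (per the SAM specification).
DUPLICATE_FLAG = 0x400


def is_duplicate(flag: int) -> bool:
    """Return True if the SAM flag has the duplicate bit set."""
    return bool(flag & DUPLICATE_FLAG)


def split_reads(reads):
    """Partition reads by the role they may play.

    reads: iterable of (read_name, sam_flag) tuples (hypothetical input shape).
    Returns (assembly_candidates, realignment_candidates).
    """
    reads = list(reads)
    # Duplicates are excluded from assembly / contig generation...
    assembly_candidates = [name for name, flag in reads if not is_duplicate(flag)]
    # ...but every read, duplicate or not, remains a realignment candidate.
    realignment_candidates = [name for name, _ in reads]
    return assembly_candidates, realignment_candidates


# Hypothetical reads: flag 1123 = 99 + 1024, so read2 is marked as a duplicate.
example_reads = [("read1", 99), ("read2", 1123), ("read3", 147)]
assembly, realign = split_reads(example_reads)
```

In this sketch `read2` is left out of `assembly` but still appears in `realign`, which mirrors the behavior described above.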

ionox0 commented 6 years ago

Thank you very much for the information 👍

ionox0 commented 6 years ago

By "where appropriate", that means when the duplicate reads map more closely to the new reference than to the old reference, correct?

We are exploring removing the MarkDuplicates step from our pipeline, and were wondering if this would cause issues with the runtime + results of Abra. We are running a comparison and can get back with the results.

mozack commented 6 years ago

"By "where appropriate", that means when the duplicate reads map more closely to the new reference than to the old reference, correct?"

Yes, that's correct. After assembly, the duplicate reads are realigned the same way as the non-duplicates.

Will be very interested in seeing your results, thanks!

ionox0 commented 6 years ago

Just to circle back on this, we ran our pipeline with and without the MarkDuplicates step and noticed a slight change in alt allele frequency, although because we are working with UMIs, there are some complicating factors that may have contributed to this discrepancy. For now we'll be leaving the MarkDuplicates step in.
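
For anyone running a similar comparison, a rough Python sketch of the per-site calculation follows. It assumes the simplified definition of alt allele frequency as alt / (ref + alt), ignoring any other alleles. The read counts are made up for illustration and are not our results.

```python
def alt_allele_frequency(ref_count: int, alt_count: int) -> float:
    """Alt allele frequency as alt / (ref + alt); a simplifying assumption."""
    depth = ref_count + alt_count
    if depth == 0:
        raise ValueError("no reads cover this site")
    return alt_count / depth


def af_delta(with_markdup, without_markdup) -> float:
    """Change in alt allele frequency when MarkDuplicates is removed.

    Each argument is a (ref_count, alt_count) tuple for the same site.
    """
    return alt_allele_frequency(*without_markdup) - alt_allele_frequency(*with_markdup)


# Hypothetical counts for one site, chosen only to show the arithmetic.
with_markdup = (90, 10)      # duplicates marked and excluded from counting
without_markdup = (170, 30)  # duplicates kept, so depth is higher
delta = af_delta(with_markdup, without_markdup)
```

A positive `delta` here would mean the alt allele looks more frequent once duplicates are kept.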