mozack / abra

Assembly Based ReAligner
MIT License

does abra realign unaligned reads #13

Open dwaggott opened 9 years ago

dwaggott commented 9 years ago

Very cool project!

Does the code currently try to salvage (a) unaligned reads and (b) distant reads with potential alternate alignments? This would help with the events I'm looking for, i.e. long indels which initially don't align and variants in highly homologous ion channels.

mozack commented 9 years ago

ABRA uses localized assembly and then globally realigns reads, so yes a) unaligned reads are salvaged and b) reads may be moved to locations distant from the original mapping.

Caveat 1: There must be enough reads mapped close to the correct location to allow the local assembly to uncover the variant in the first place. I've found that bwa mem generally does a good job of doing this.

Caveat 2: Regarding homologous regions, the realignment is fairly conservative. We do not include ambiguously mapped reads in the local assembly at present and discard assembled contigs that have low mapping quality scores.
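To make Caveat 2 concrete, here is a minimal sketch of that kind of conservative filter. It is not ABRA's code: the class, the `Aligned` type, and the `MIN_MAPQ` threshold are hypothetical, and it assumes the aligner reports ambiguously placed records with a mapping quality of 0, which is the usual convention for bwa.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: names and threshold are assumptions, not ABRA internals.
class ConservativeFilterSketch {
    // Hypothetical cutoff: MAPQ 0 is treated as "ambiguously mapped".
    static final int MIN_MAPQ = 1;

    /** Minimal stand-in for an aligned read or contig: a name plus its MAPQ. */
    static final class Aligned {
        final String name;
        final int mapq;

        Aligned(String name, int mapq) {
            this.name = name;
            this.mapq = mapq;
        }
    }

    /** Keep only records whose mapping quality is at least minMapq. */
    static List<Aligned> keepConfident(List<Aligned> records, int minMapq) {
        List<Aligned> kept = new ArrayList<>();
        for (Aligned r : records) {
            if (r.mapq >= minMapq) {
                kept.add(r);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Aligned> reads = new ArrayList<>();
        reads.add(new Aligned("read1", 60)); // uniquely placed
        reads.add(new Aligned("read2", 0));  // ambiguous, e.g. a homologous region
        for (Aligned r : keepConfident(reads, MIN_MAPQ)) {
            System.out.println(r.name);
        }
    }
}
```

The same check could be applied twice, once to the reads going into the local assembly and once to the assembled contigs, which is why reads from highly homologous genes can end up excluded at both stages.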

dwaggott commented 9 years ago

Thanks, I'll test and see and provide feedback.

dwaggott commented 9 years ago

I get the following message when trying to use the --aur option: "Assemble unaligned reads (currently disabled)".

Maybe it's from here?

ABRA v0.86

src/main/java/abra/ReAlignerOptions.java

        if (shouldReprocessUnaligned) {
//          processUnaligned();
        }

mozack commented 9 years ago

Sorry - that option is stale and is no longer supported.

With older versions of bwa, we found a benefit to assembling all unaligned reads together and mapping the resulting contigs to the genome. The impact was biggest for single end reads.
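For anyone wanting to experiment with this outside ABRA, the first step is collecting the unaligned reads. The sketch below is not taken from ABRA; the class and method names are hypothetical. It relies only on the SAM format itself: fields are tab-separated, header lines start with `@`, and bit 0x4 of the FLAG field marks a read as unmapped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a plain-text SAM scan, not ABRA's reader.
class UnmappedReadSketch {
    // SAM FLAG bit meaning "segment unmapped".
    static final int FLAG_UNMAPPED = 0x4;

    /** True if the FLAG field (second column) of a SAM record has bit 0x4 set. */
    static boolean isUnmapped(String samLine) {
        String[] fields = samLine.split("\t");
        int flag = Integer.parseInt(fields[1]);
        return (flag & FLAG_UNMAPPED) != 0;
    }

    /** Collect the read names (first column) of all unmapped records. */
    static List<String> unmappedNames(List<String> samLines) {
        List<String> names = new ArrayList<>();
        for (String line : samLines) {
            if (line.startsWith("@")) {
                continue; // skip header lines
            }
            if (isUnmapped(line)) {
                names.add(line.split("\t", 2)[0]);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> sam = Arrays.asList(
            "@HD\tVN:1.6",
            "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
            "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII");
        System.out.println(unmappedNames(sam));
    }
}
```

In practice a BAM library would be a better fit than string splitting; the point here is only which records count as "unaligned".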

With recent versions of bwa mem, we've found this to no longer be necessary.

dwaggott commented 9 years ago

Hmm, based on simulation, the reads supporting the features we are looking for specifically end up either placed incorrectly in pseudogenes or left unaligned. Would you have any recommendations on how I might go about adding this functionality back to your code?

mozack commented 9 years ago

Can you tell me a little more about the features you're looking for? Do you have a sample dataset you could share?

Earlier implementations assembled the unaligned reads together, mapped the contigs to the reference, mapped the reads to the contigs, and assigned a provisional location to the reads for inclusion in the standard localized assembly. I will need to do some digging to get a feel for what re-enabling this will take.
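The last step of that older approach, assigning a provisional location, can be sketched roughly as follows. This is a reconstruction from the description above, not the removed ABRA code: all names are hypothetical, and it assumes each contig maps to the forward strand with no indels before the read's offset, so the read's provisional position is simply the contig's reference start plus the read's offset within the contig.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a simplified model of provisional read placement.
class ProvisionalLocationSketch {
    /** A reference position: chromosome name plus coordinate. */
    static final class Locus {
        final String chrom;
        final int pos;

        Locus(String chrom, int pos) {
            this.chrom = chrom;
            this.pos = pos;
        }
    }

    /** Where a read landed when mapped against the assembled contigs. */
    static final class ContigHit {
        final String contig;
        final int offset; // 0-based offset of the read within the contig

        ContigHit(String contig, int offset) {
            this.contig = contig;
            this.offset = offset;
        }
    }

    /**
     * Combine contig-to-reference and read-to-contig mappings into a
     * provisional reference locus per read. Reads whose contig did not
     * map to the reference are left out.
     */
    static Map<String, Locus> assign(Map<String, Locus> contigToRef,
                                     Map<String, ContigHit> readToContig) {
        Map<String, Locus> provisional = new HashMap<>();
        for (Map.Entry<String, ContigHit> e : readToContig.entrySet()) {
            ContigHit hit = e.getValue();
            Locus contigLocus = contigToRef.get(hit.contig);
            if (contigLocus == null) {
                continue; // contig has no reference placement
            }
            provisional.put(e.getKey(),
                new Locus(contigLocus.chrom, contigLocus.pos + hit.offset));
        }
        return provisional;
    }

    public static void main(String[] args) {
        Map<String, Locus> contigToRef = new HashMap<>();
        contigToRef.put("contig1", new Locus("chr7", 1000));
        Map<String, ContigHit> readToContig = new HashMap<>();
        readToContig.put("r1", new ContigHit("contig1", 25));
        Locus l = assign(contigToRef, readToContig).get("r1");
        System.out.println(l.chrom + ":" + l.pos);
    }
}
```

Reads placed this way would then be picked up by the standard localized assembly for the region they were assigned to. A real implementation would also need to handle reverse-strand contigs and gapped contig alignments.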