Open dwaggott opened 9 years ago
ABRA uses localized assembly and then globally realigns reads, so yes a) unaligned reads are salvaged and b) reads may be moved to locations distant from the original mapping.
Caveat 1: There must be enough reads mapped close to the correct location to allow the local assembly to uncover the variant in the first place. I've found that bwa mem generally does a good job of this.
Caveat 2: Regarding homologous regions, the realignment is fairly conservative. We do not include ambiguously mapped reads in the local assembly at present and discard assembled contigs that have low mapping quality scores.
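To illustrate the kind of filter described in Caveat 2, here is a minimal sketch, not ABRA's actual code. The class name, the `Alignment` type, and the `MIN_MAPQ` cutoff are all hypothetical; the real thresholds are not stated in this thread. The same check could apply both to reads going into the local assembly and to assembled contigs after they are mapped back.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the ABRA source.
class ConservativeFilter {
    // Assumed cutoff for "confidently mapped"; the value ABRA uses is not given here.
    static final int MIN_MAPQ = 10;

    /** Minimal stand-in for an aligned read or contig: a name plus a mapping quality. */
    static class Alignment {
        final String name;
        final int mapq;

        Alignment(String name, int mapq) {
            this.name = name;
            this.mapq = mapq;
        }
    }

    /** Returns only the alignments whose mapping quality is at least minMapq. */
    static List<Alignment> keepConfident(List<Alignment> input, int minMapq) {
        List<Alignment> kept = new ArrayList<>();
        for (Alignment a : input) {
            if (a.mapq >= minMapq) {
                kept.add(a);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Alignment> contigs = new ArrayList<>();
        contigs.add(new Alignment("contig_unique", 60));
        contigs.add(new Alignment("contig_homologous", 0)); // ambiguous placement
        for (Alignment a : keepConfident(contigs, MIN_MAPQ)) {
            System.out.println(a.name);
        }
    }
}
```

With a filter like this, a contig that maps equally well to a gene and its pseudogene would typically get a mapping quality near 0 and be dropped, which is why events in highly homologous regions may not be realigned.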
Thanks, I'll test and see and provide feedback.
I get the following message when trying to use the --aur option:
Assemble unaligned reads (currently disabled).
Maybe it's from here?
ABRA v0.86
src/main/java/abra/ReAlignerOptions.java
if (shouldReprocessUnaligned) {
    // processUnaligned();
}
Sorry - that option is stale and is no longer supported.
With older versions of bwa, we found a benefit to assembling all unaligned reads together and mapping the resulting contigs to the genome. The impact was biggest for single end reads.
With recent versions of bwa mem, we've found this to no longer be necessary.
Hmm, based on simulation, the features we are looking for end up either incorrectly placed in pseudogenes or among the unaligned reads. Would you have any recommendations on how I might go about adding this functionality back to your code?
Can you tell me a little more about the features you're looking for? Do you have a sample dataset you could share?
Earlier implementations assembled the unaligned reads together, mapped the resulting contigs to the reference, mapped the reads back to the contigs, and assigned each read a provisional location for inclusion in the standard localized assembly. I'll need to do some digging to get a feel for what re-enabling this would take.
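As a rough sketch of the last step in that flow (giving each unaligned read a provisional location from its contig), something like the following could work. This is not the old ABRA code: the class and type names are hypothetical, and it assumes each contig maps to the forward strand without gaps, so a read's position is simply the contig's reference position plus the read's offset within the contig.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not taken from the ABRA source.
class ProvisionalPlacement {
    /** A reference position: chromosome name plus coordinate. */
    static class Locus {
        final String chrom;
        final int pos;

        Locus(String chrom, int pos) {
            this.chrom = chrom;
            this.pos = pos;
        }
    }

    /** Where a read landed when mapped against the assembled contigs. */
    static class ContigHit {
        final String contig;
        final int offset;

        ContigHit(String contig, int offset) {
            this.contig = contig;
            this.offset = offset;
        }
    }

    /**
     * Assigns each read a provisional reference location from its contig's placement.
     * Reads whose contig did not map to the reference are left unplaced.
     * Assumes forward-strand, ungapped contig alignments (a simplification).
     */
    static Map<String, Locus> assign(Map<String, Locus> contigLoci,
                                     Map<String, ContigHit> readHits) {
        Map<String, Locus> placed = new LinkedHashMap<>();
        for (Map.Entry<String, ContigHit> e : readHits.entrySet()) {
            ContigHit hit = e.getValue();
            Locus anchor = contigLoci.get(hit.contig);
            if (anchor == null) {
                continue; // contig has no reference placement
            }
            placed.put(e.getKey(), new Locus(anchor.chrom, anchor.pos + hit.offset));
        }
        return placed;
    }
}
```

Reads placed this way would then be picked up by the localized assembly for whichever region contains their provisional location.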
Very cool project!
Does the code currently (a) try to salvage unaligned reads and (b) consider distant reads with potential alternate alignments? This would help for the events I'm looking for, i.e. long indels which initially don't align and variants in highly homologous ion channels.