czbiohub-sf / MIDAS

Metagenomic Intra-Species Diversity Analysis (MIDAS)
MIT License

bwa-meme #106

nick-youngblut closed 1 year ago

nick-youngblut commented 1 year ago

From the MIDAS2 paper:

with database customization and Bowtie2 alignment taking up to 75% of run time

Given the apparent computational bottleneck of bowtie2, are there any plans to add bwa-meme as an alternative aligner?

zhaoc1 commented 1 year ago

Thanks for the suggestion. How does bwa-meme compute the alignment quality score, e.g., in terms of multi-alignment and mis-alignment?

nick-youngblut commented 1 year ago

I believe that bwa-meme is quite similar to bwa-mem with regard to the quality score (basically, it is a drop-in replacement for bwa-mem or bwa-mem2).

zhaoc1 commented 1 year ago

I see. Thanks for the reply. We have a simulation-based, in-depth benchmarking/analysis of the Bowtie2-based metagenotyping (https://www.biorxiv.org/content/10.1101/2022.06.30.498336v1). We also relied on this blog post as a starting point for understanding how bowtie2 assigns MAPQ scores (http://biofinysics.blogspot.com/2014/05/how-does-bowtie2-assign-mapq-scores.html).

bwa-meme does seem like a compelling alignment tool given its reported speed. However, I don't have the bandwidth to benchmark it for the metagenotyping results in the near future. I will put it on my TODO list.

Thanks for the suggestion.

nick-youngblut commented 1 year ago

It's too bad that BWA-MEME came out so recently. Using BWA-MEME instead of bowtie2 likely would have substantially improved the run-time performance of MIDAS2.

nick-youngblut commented 1 year ago

@zhaoc1 I just wanted to check in and see if you found time to test out BWA-MEME. Did it substantially help improve the computational efficiency?

zhaoc1 commented 1 year ago

Hi @nick-youngblut,

No, I haven't. As a matter of fact, I am more concerned about the cross-mapping issue, especially using MAGs as the template genomes. E.g., mobile elements are prone to be mis-assembled in the first place, and later on are prone to cross-mapping. In other words, my current research interest focuses mostly on the accuracy rather than the efficiency of mapping. Sorry I couldn't be of more help with testing out BWA-MEME.

nick-youngblut commented 1 year ago

Thanks for letting me know!

especially using MAGs as the template genomes ... prone to be mis-assembled in the first place

You might be interested in ResMico for identifying misassemblies. The peer-reviewed version of the paper should be published very soon.

zhaoc1 commented 1 year ago

Nice. Thank you! Reading it now.

I am going to close this issue.