AmpliconSuite / AmpliconSuite-pipeline

A quickstart tool for AmpliconArchitect. Performs all preliminary steps (alignment, CNV calling, seed interval detection) required prior to running AmpliconArchitect. Previously called PrepareAA.

Handling BAMs aligned with BWA-MEM -m flag #44

Closed tjbencomo closed 1 year ago

tjbencomo commented 1 year ago

Hi - I have tumor BAMs that have already been fully processed, including reference genome alignment with bwa mem using the -m flag. I saw in the documentation that reads should be aligned without the -m flag.

Would there be any issue if I used Picard's SamToFastq to convert the BAM to a FASTQ and then used the FASTQ as input to the pipeline? I can access the original FASTQs if needed but it would be easier to use the BAM.

Thanks

jluebeck commented 1 year ago

Hi Tomas, thanks for this question - indeed the reads will need to be realigned and you can provide the two fastq files as input to the pipeline. For unmapping bams, we provide this script for users who might need it, which uses samtools to do the unmapping. This method is supposedly faster than Picard's SamToFastq but I have not done any benchmarking myself - just an option to consider.
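For readers without access to the linked script, here is a rough sketch of one common way to unmap a BAM with samtools. It is not the script mentioned above, only an illustration that assumes paired-end reads and a samtools binary on the PATH. The file names and the helper names (`unmap_commands`, `as_shell`, `run`) are hypothetical.

```python
import shlex
import subprocess


def unmap_commands(bam, r1, r2, threads=4):
    """Build argv lists for `samtools collate | samtools fastq`.

    collate groups mates together so that fastq can split them into R1/R2.
    """
    collate = ["samtools", "collate", "-u", "-O", "-@", str(threads), bam]
    fastq = [
        "samtools", "fastq", "-@", str(threads),
        "-n",                 # keep read names as-is (no /1, /2 suffixes)
        "-1", r1, "-2", r2,   # paired reads
        "-0", "/dev/null",    # discard reads flagged as neither/both mates
        "-s", "/dev/null",    # discard singletons
        "-",                  # read the collated stream from stdin
    ]
    return collate, fastq


def as_shell(collate, fastq):
    """Render the two argv lists as a single shell pipeline string."""
    def quote(argv):
        return " ".join(shlex.quote(a) for a in argv)
    return quote(collate) + " | " + quote(fastq)


def run(collate, fastq):
    """Run the pipeline; raises if either samtools step fails."""
    p1 = subprocess.Popen(collate, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(fastq, stdin=p1.stdout)
    p1.stdout.close()  # let collate see a closed pipe if fastq exits early
    rc2, rc1 = p2.wait(), p1.wait()
    if rc1 != 0 or rc2 != 0:
        raise RuntimeError(f"samtools failed (collate={rc1}, fastq={rc2})")


# Placeholder file names; this only prints the command, it does not run it.
collate, fastq = unmap_commands("tumor.bam", "tumor_R1.fastq.gz", "tumor_R2.fastq.gz")
print(as_shell(collate, fastq))
```

Calling `run(collate, fastq)` would execute the pipeline. Discarding singletons and unpaired reads is a choice made for this sketch, so adjust it if those reads matter for your data.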

I recommend two stages for this workflow when running from fastqs:

Stage 1: Run AmpliconSuite-pipeline.py for alignment and CNV seed generation (multithreaded as needed), but do not set --run_AA --run_AC.

Stage 2: Run AmpliconSuite-pipeline.py again, now for focal amplification identification (--run_AA --run_AC set), using the bam file and [sample]_AA_CNV_SEEDS.bed file produced from stage 1. This stage only uses a single thread.
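The two stages can be sketched as a pair of command lines. This is a hedged illustration, not an official recipe: the flags other than `--run_AA` and `--run_AC` (`-s`, `-t`, `--fastqs`, `--bam`, `--cnv_bed`, `--ref`) are written as I understand them from the pipeline README, so check `AmpliconSuite-pipeline.py --help` for your version. The sample name, file paths, thread count, and the `GRCh38` reference are placeholders, and the helper names are hypothetical.

```python
import shlex

SCRIPT = "AmpliconSuite-pipeline.py"


def seeds_bed_name(sample):
    """Seed file name following the [sample]_AA_CNV_SEEDS.bed pattern."""
    return f"{sample}_AA_CNV_SEEDS.bed"


def stage1_cmd(sample, fq1, fq2, ref="GRCh38", threads=8):
    """Stage 1: alignment and CNV seed generation; no --run_AA / --run_AC."""
    return [SCRIPT, "-s", sample, "-t", str(threads),
            "--fastqs", fq1, fq2, "--ref", ref]


def stage2_cmd(sample, bam, seeds_bed, ref="GRCh38"):
    """Stage 2: focal amplification identification from stage-1 outputs.

    This stage is single-threaded, so -t is fixed at 1 here.
    """
    return [SCRIPT, "-s", sample, "-t", "1",
            "--bam", bam, "--cnv_bed", seeds_bed, "--ref", ref,
            "--run_AA", "--run_AC"]


# Placeholder inputs; where stage 1 writes the bam and seed file depends on
# your output settings, so the paths below are assumptions.
sample = "tumor01"
s1 = stage1_cmd(sample, "tumor01_R1.fastq.gz", "tumor01_R2.fastq.gz")
s2 = stage2_cmd(sample, "tumor01.bam", seeds_bed_name(sample))
for cmd in (s1, s2):
    print(" ".join(shlex.quote(a) for a in cmd))
```

Splitting the run this way lets stage 1 use as many threads as are available, while stage 2 can be scheduled on a single core.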

Thanks, Jens

tjbencomo commented 1 year ago

Hi Jens - thanks for the quick response! I will try that script instead of using Picard and then follow the 2 step process you suggested. Thanks!