virajbdeshpande / AmpliconArchitect

AmpliconArchitect (AA) is a tool to identify one or more connected genomic regions which have simultaneous copy number amplification and to elucidate the architecture of the amplicon. In the current version, AA takes as input next generation sequencing reads (paired-end Illumina reads) mapped to the hg19/GRCh37 reference sequence and one or more regions of interest. Please "watch" this repository for improvements in runtime, accuracy and annotations for the GRCh38 human reference genome, coming soon.

Reproducibility of AmpliconArchitect #126

Open EunchongHuang opened 1 year ago

EunchongHuang commented 1 year ago

Dear AmpliconArchitect Team, I would like to ask you a few questions regarding the samples you used in the manuscript. I used the same WGS data from BioProject (accession number: PRJNA437014, only KT samples) and ran PrepareAA in default mode, since I'm less experienced with AA. I then obtained the number of amplicons and the number of amplified oncogenes for each sample and compared my results with your results from figshare.

I found that only KT22, KT26, KT31, KT32, KT33, KT34 had such different results (the rest of them showed no amplicons and no amplified oncogenes). Can you please tell me why there is such a difference? If it is okay, could you also tell me the actual parameter set you used when you ran the pipeline?
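For reference, a per-sample comparison like the one described above could be sketched as follows. This is only an illustration: the function name `compare_counts`, the sample IDs, and all counts are hypothetical placeholders, not values from my runs or from the figshare tables.

```python
# Minimal sketch: report samples whose (amplicon count, oncogene count)
# differ between two result sets. All names and numbers are placeholders.

def compare_counts(mine, published):
    """Return {sample: (my_counts, published_counts)} for differing samples.

    A sample missing from one result set is reported with None on that side.
    """
    diffs = {}
    for sample in sorted(set(mine) | set(published)):
        a = mine.get(sample)
        b = published.get(sample)
        if a != b:
            diffs[sample] = (a, b)
    return diffs


# Hypothetical inputs: sample -> (number of amplicons, number of amplified oncogenes)
mine = {"S1": (2, 3), "S2": (0, 0), "S3": (1, 1)}
published = {"S1": (2, 3), "S2": (1, 2), "S4": (1, 0)}

for sample, (a, b) in compare_counts(mine, published).items():
    print(sample, "mine:", a, "published:", b)
```

Listing the differing samples side by side like this makes it explicit which samples agree and which do not.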

jluebeck commented 1 year ago

Hi,

Can you clarify whether KT22, KT26, KT31, KT32, KT33, KT34 were the samples you found to be the same, or whether those were the samples that differed? Would you be able to share the output files, particularly the log files & stdout (PAA & AA), for a representative example such as KT11? Would you also be able to share the exact commands you ran? If this is too much information to put into a comment, you can email me at jluebeck [a t ] ucsd.edu

One difference between the 2019 paper and current best practices is that the CNV tool readDepth was used for CNV seeding in the 2019 paper, whereas PrepareAA typically uses CNVkit (though users can provide their own CNV calls as well).

If you downloaded BAM files directly from SRA, keep in mind that SRA will strip certain tags from the BAM files, so for best reproducibility, re-alignment of fastq files with bwa mem is somewhat better.
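One rough way to see which optional tags are present in a given file is to tally the tag names in its alignment records. The sketch below assumes the records are available as SAM text lines (for example, as produced by `samtools view`); the function name `count_optional_tags` and the two example records are made up for illustration. It does not say which tags SRA removes, it only shows what a file contains.

```python
from collections import Counter


def count_optional_tags(sam_lines):
    """Count optional tag names (e.g. NM, MD) across SAM alignment lines.

    SAM records have 11 mandatory tab-separated fields; any further fields
    are optional and formatted as TAG:TYPE:VALUE.
    """
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        for field in fields[11:]:
            counts[field.split(":", 1)[0]] += 1
    return counts


# Made-up records for illustration only.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tchr1\t100\t60\t5M\t=\t200\t105\tACGTA\tIIIII\tNM:i:0\tMD:Z:5\tAS:i:5",
    "r2\t147\tchr1\t200\t60\t5M\t=\t100\t-105\tACGTA\tIIIII\tNM:i:0",
]
print(dict(count_optional_tags(example)))
```

Running the same tally on a downloaded BAM and on a re-aligned BAM would show whether the two carry the same set of tags.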

Thanks, Jens