Open bkinnersley opened 4 months ago
Hi Ben,

For your first question, "how we interpret the output files of ATAC-amp, and how this can help prioritise identification of genuine ecDNA amplicons?": for bulk ATAC-seq data in ‘bulk’ mode, ATACAmp produces only one main result, the ‘.result’ file, which lists the candidate ecDNA/HSR-forming regions ordered by score from highest to lowest. For single-cell ATAC data in ‘sc’ mode, this file is slightly different: the last line of each candidate ecDNA/HSR region gives the barcodes of the cells that support that region, for subsequent analyses at the cell-population level.

However, the results of the current ATACAmp analysis are very sensitive to data quality, so for your data I would suggest running QC first and analysing only high-quality reads. As far as I understand, there are not many cases of chrY fragments forming ecDNA, so you can prioritise regions that carry oncogenes and regions larger than 100 kb.
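The prioritisation advice above could be sketched roughly as follows. This is only an illustration, not part of ATACAmp: it assumes the candidate regions have already been parsed into records with a chromosome, coordinates, a score and a set of overlapping gene names (the `Region` class, the `prioritise` function and the short `ONCOGENES` set are hypothetical, and the actual ‘.result’ column layout is not assumed here). It ranks rather than filters, so chrY regions are pushed down instead of being discarded.

```python
from dataclasses import dataclass, field

# Hypothetical record for one candidate region; how these fields are
# obtained from the '.result' file is left open on purpose.
@dataclass
class Region:
    chrom: str
    start: int
    end: int
    score: float
    genes: set = field(default_factory=set)

# Illustrative gene set only; substitute a curated oncogene list.
ONCOGENES = {"MYC", "EGFR", "MDM2", "CDK4"}

def prioritise(regions, min_size=100_000, oncogenes=ONCOGENES):
    """Rank candidates: non-chrY first, then oncogene-carrying,
    then larger than min_size, then by score (highest first)."""
    def key(region):
        return (
            region.chrom != "chrY",
            bool(region.genes & oncogenes),
            (region.end - region.start) > min_size,
            region.score,
        )
    return sorted(regions, key=key, reverse=True)

candidates = [
    Region("chrY", 1_000_000, 1_500_000, 9.0),
    Region("chr8", 127_000_000, 128_500_000, 5.0, {"MYC"}),
    Region("chr1", 100, 50_100, 7.0),
    Region("chr7", 55_000_000, 55_050_000, 3.0, {"EGFR"}),
]
for region in prioritise(candidates):
    print(region.chrom, region.start, region.end, region.score)
```

The order of the criteria in the sort key is a judgment call; swapping the tuple entries changes which criterion dominates.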
About the parameters you mentioned:

1. --mode (0, 1 or 2) selects which input files ATACAmp runs on. Mode ‘0’ accepts the BAM file, mode ‘1’ accepts the split-read and discordant-read files, and mode ‘2’ accepts the interval file. The latter modes make it possible to analyse breakpoint information obtained from other software, and to save running time when re-running after adjusting the parameters.
2. --isize_value is the insert size of the discordant reads. It depends on the sequencing library construction method, but 1000 is a suitable value for most second-generation sequencing methods on the market.
3. --interval_size controls the step size from the breakpoint when calculating the amplified region. 1000 is an empirical value; you can try a larger value to speed up the calculation, or a smaller value to make the boundaries finer.

Finally, at the moment ATACAmp still has limited single-cell resolution and cannot yet analyse abundance, but we will continue to develop the software, with updates to detect variants in conjunction with new single-cell genome-level sequencing technologies.
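To try several --interval_size values as suggested, one option is to generate the command lines from a small helper. The sketch below only reuses the flags that appear in the command posted in this thread; the helper name `build_atacamp_command`, the `sample.bam` input, the run names and the sweep values 500 and 5000 are placeholders chosen for illustration, not recommendations.

```python
import shlex

def build_atacamp_command(bam, name, gtf, interval_size=1000,
                          isize_value=1000, mapq=30, mode=0,
                          data_type="sc", threads=16,
                          script="/path/to/software/ATAC-amp/AtacAmp.py"):
    """Return the AtacAmp.py invocation as an argument list.

    Only flags seen in the original command are used; defaults mirror
    the values from that command.
    """
    return [
        "python", script,
        "--bam", bam,
        "--name", name,
        "--isize_value", str(isize_value),
        "--interval_size", str(interval_size),
        "--mapq", str(mapq),
        "--mode", str(mode),
        "--type", data_type,
        "--gtf", gtf,
        "--threads", str(threads),
    ]

# Placeholder inputs; replace with real paths before running anything.
for size in (500, 1000, 5000):
    cmd = build_atacamp_command(
        bam="sample.bam",
        name=f"sample_interval{size}",
        gtf="/path/to/hg38.ncbiRefSeq.gtf",
        interval_size=size,
    )
    # Print rather than execute, so the commands can be inspected first.
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` once the paths are real.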
Hello,
Thanks for this very useful package!
I just have a few questions after running it on single-cell ATAC-seq libraries generated from the 10X multiome kit. I ran it using the following command:
python /path/to/software/ATAC-amp/AtacAmp.py \
--bam \
--name \
--isize_value 1000 \
--interval_size 1000 \
--mapq 30 \
--mode 0 \
--type sc \
--gtf /path/to/hg38.ncbiRefSeq.gtf \
--threads 16
I've attached output files from this run in "TEST_output.zip"
I just have a few questions:
Thanks very much
Best wishes
Ben

TEST_output.zip