virajbdeshpande / AmpliconArchitect

AmpliconArchitect (AA) is a tool that identifies one or more connected genomic regions with simultaneous copy number amplification and elucidates the architecture of the amplicon. In the current version, AA takes as input next-generation sequencing reads (paired-end Illumina reads) mapped to the hg19/GRCh37 reference sequence and one or more regions of interest. Please "watch" this repository for upcoming improvements in runtime, accuracy, and annotations for the GRCh38 human reference genome.

Question about using AA with reference genomes of other species #140

Open fanch1122 opened 1 year ago

fanch1122 commented 1 year ago

I am new to AA. I see that the -ref option in the guide covers reference genomes for human and mouse. Can I use reference genomes of other species to detect eccDNA from WGS data with AA?

jluebeck commented 1 year ago

Hi, at this time only hg19, GRCh37, GRCh38, and mm10 are supported. Supporting references for other species requires constructing an annotation database. Unfortunately, that process is quite complicated and requires multiple annotation files to be available from the UCSC genome browser and other sites. Not all species are as well annotated as human and mouse, so it may not be feasible for most species.

Thanks, Jens

fanch1122 commented 1 year ago

What I actually want is to use AA to find eccDNA in WGS data from other species. I assume this also requires the complicated annotations you described. Is there a relatively simple way to set it up?

jluebeck commented 1 year ago

Which species do you have in mind? As mentioned above, the process is complicated and involves annotations from many sources. There is not a relatively simple way to do it.

fanch1122 commented 1 year ago

I am currently working on Paramecium.

fanch1122 commented 1 year ago

From AA's usage instructions, it looks like the required data are the WGS FASTQ files and the genome FASTA sequence for the species. Is that correct? If I don't specifically need the eccDNA to be annotated, can I still use AA for this?

jluebeck commented 1 year ago

The AA genome annotations are used for marking low-complexity regions, repetitive regions, low-mappability regions, and oncogenes, as well as areas that show high signal across many "normal" samples.

You would need to collect or generate analogous files for what is listed in the AA data repo. You will encounter many errors if you try to leave these kinds of files out of the analysis.

Thanks, Jens
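
For readers attempting this with a non-model organism, the annotation categories listed in the reply above can be turned into a simple pre-flight checklist. The sketch below is only an illustration: the filenames in `REQUIRED_ANNOTATIONS` and the function `missing_annotations` are hypothetical placeholders, not the names AA or the AA data repo actually uses, and passing the check does not mean AA will accept the files.

```python
from pathlib import Path

# Annotation categories named in the reply above.
# The filenames are HYPOTHETICAL placeholders chosen for illustration;
# consult the AA data repo for the files AA actually expects.
REQUIRED_ANNOTATIONS = {
    "low-complexity / repetitive regions": "repeats.bed",
    "low-mappability regions": "low_mappability.bed",
    "genes of interest (oncogenes in human)": "genes.gff",
    "regions with high signal in normal samples": "normal_high_signal.bed",
}


def missing_annotations(ref_dir):
    """Return the annotation categories with no file present in ref_dir."""
    ref_dir = Path(ref_dir)
    return [
        category
        for category, filename in REQUIRED_ANNOTATIONS.items()
        if not (ref_dir / filename).is_file()
    ]


if __name__ == "__main__":
    # "my_species_ref" is an assumed directory name for a custom reference.
    for category in missing_annotations("my_species_ref"):
        print(f"missing: {category}")
```

Running the script against a candidate reference directory lists the categories that still need to be collected or generated before attempting an analysis.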