virajbdeshpande / AmpliconArchitect

AmpliconArchitect (AA) is a tool that identifies one or more connected genomic regions with simultaneous copy number amplification and elucidates the architecture of the amplicon. In the current version, AA takes as input next-generation sequencing reads (paired-end Illumina reads) mapped to the hg19/GRCh37 reference sequence and one or more regions of interest. Please "watch" this repository for upcoming improvements in runtime, accuracy and annotations for the GRCh38 human reference genome.

# of discordant reads #133

Open hj6-sanger opened 1 year ago

hj6-sanger commented 1 year ago

Dear Staff,

Thank you for developing and maintaining this wonderful tool.

I have a quick question.

The number of discordant reads in _edges_cnseg.txt (I know this number is double-counted) differs from the number in graph.txt.

Some discordant reads in _edges_cnseg.txt seem to disappear from graph.txt.

Could you tell me why? (My guess is that discordant reads that are not supported by CNVs are removed?)

Thanks again for developing this wonderful tool.

BW

jluebeck commented 1 year ago

Hi BW,

This is a very good question, and @virajbdeshpande may be better able to answer it, but I believe this happens when an edge connects to something outside the focally amplified regions detected by AA. Such edges may be biologically interesting (e.g. possible HSR integration points), but they may also be false-positive SVs, and they are essentially filtered out of the graph file. The other, and in my opinion less likely, explanation is that the region the edge joins to was mistakenly not detected as amplified by AA and was therefore not included in the focal amplification.
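To make that first explanation concrete, here is a rough Python sketch of the idea: an edge is kept only if both of its endpoints fall inside the amplified intervals, and is dropped otherwise. This is not AA's actual filtering code. The edge notation ("chrA:posA+->chrB:posB-"), the interval tuples, and the names parse_edge, inside and split_edges are all assumptions made for illustration, and the coordinates are made up.

```python
import re
from typing import NamedTuple


class Endpoint(NamedTuple):
    chrom: str
    pos: int
    strand: str


# Assumed edge notation for this sketch: "chrA:posA+->chrB:posB-".
EDGE_RE = re.compile(r"^(\w+):(\d+)([+-])->(\w+):(\d+)([+-])$")


def parse_edge(text):
    """Split an edge string into its two endpoints (hypothetical helper)."""
    m = EDGE_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognised edge string: {text!r}")
    c1, p1, s1, c2, p2, s2 = m.groups()
    return Endpoint(c1, int(p1), s1), Endpoint(c2, int(p2), s2)


def inside(endpoint, intervals):
    """True if the endpoint lies within any (chrom, start, end) interval."""
    return any(
        chrom == endpoint.chrom and start <= endpoint.pos <= end
        for chrom, start, end in intervals
    )


def split_edges(edges, intervals):
    """Separate edges fully inside the intervals from those leaving them."""
    kept, dropped = [], []
    for text in edges:
        a, b = parse_edge(text)
        if inside(a, intervals) and inside(b, intervals):
            kept.append(text)
        else:
            dropped.append(text)
    return kept, dropped


# Made-up example data: one amplified interval and two discordant edges.
amplified = [("chr8", 500, 6000)]
edges = ["chr8:1000+->chr8:5000-", "chr8:2000-->chr12:700+"]
kept, dropped = split_edges(edges, amplified)
print("kept:", kept)
print("dropped:", dropped)
```

Under these assumptions, the edges in the dropped list are the ones that would appear in the edge listing but not in the graph, and they are the candidates worth inspecting by hand.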

Thanks, Jens