AmpliconSuite / AmpliconSuite-pipeline

A quickstart tool for AmpliconArchitect. Performs all preliminary steps (alignment, CNV calling, seed interval detection) required prior to running AmpliconArchitect. Previously called PrepareAA.

Suggestions for arguments to AA to tweak? #39

Open tischfis opened 1 year ago

tischfis commented 1 year ago

I have a few WGS samples (matched T/N) at 80X with relatively high purity, and matched FISH showing ecDNA carrying the target gene. I ran AmpliconSuite-pipeline with the suggested inputs, but the output does not suggest ecDNA.

For sample A, it identifies the gene in the amplicon, contained in the cycle with the highest CN. However, that cycle is not predicted to be cyclic, and AA_classifier says it's "complex non-cyclic". None of the cycles for that amplicon are predicted to be circular.

For sample B, it identifies the gene in the amplicon (not in the cycle with the highest CN), and again that cycle is not predicted to be circular. However, AA_classifier says the amplicon is ecDNA. I understand it's possible for high-CN cycles to be ecDNA in origin. In this case the CN is 7.5 (the CN of the highest cycle is 16).

I am wondering if there are any arguments to AA that I might adjust to see if the output changes. Any suggestions would be appreciated. Thanks!

jluebeck commented 1 year ago

Hi,

Thanks for checking in with this feedback and questions.

Sample A: Hard to say without seeing the outputs, but this may be a case where the short-read data does not identify all of the SVs needed to support a circular genome structure.

Sample B: The cycles reported by AA are decompositions of the genome graph in a way that best explains the changes in copy number. While these paths contain the signatures of ecDNA, they are not necessarily complete reconstructions of the ecDNAs, particularly if an SV is missed by the short reads. Thus, it is very possible for a non-cyclic path overlapping the ecDNA region to be assigned a higher copy number than the ecDNA itself.
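To make the point about decompositions concrete, here is a minimal Python sketch that ranks paths by copy number and flags which ones are cyclic. It assumes the path lines in an AA cycles file look like `Cycle=1;Copy_count=16.0;Segments=0+,1+,2+,0-` and that a path starting and ending at segment `0` (a source/sink placeholder) is non-cyclic; both are assumptions to check against your own output files. The two example lines are made up for illustration and are not taken from the samples in this thread.

```python
from typing import List, NamedTuple


class Path(NamedTuple):
    cycle_id: int
    copy_count: float
    segments: List[str]
    is_cyclic: bool


def parse_cycle_line(line: str) -> Path:
    """Parse one 'Cycle=...;Copy_count=...;Segments=...' line (assumed format)."""
    fields = dict(item.split("=", 1) for item in line.strip().split(";") if "=" in item)
    segments = fields["Segments"].split(",")
    # Assumption: segment id 0 is a source/sink placeholder, so a path that
    # both starts and ends with it is a linear (non-cyclic) path.
    starts_at_source = segments[0][:-1] == "0"
    ends_at_sink = segments[-1][:-1] == "0"
    return Path(
        cycle_id=int(fields["Cycle"]),
        copy_count=float(fields["Copy_count"]),
        segments=segments,
        is_cyclic=not (starts_at_source and ends_at_sink),
    )


# Hypothetical lines, invented for this example.
example_lines = [
    "Cycle=1;Copy_count=16.0;Segments=0+,1+,2+,0-",
    "Cycle=2;Copy_count=7.5;Segments=3+,4-,5+",
]

paths = sorted(
    (parse_cycle_line(line) for line in example_lines),
    key=lambda p: p.copy_count,
    reverse=True,
)

for p in paths:
    kind = "cyclic" if p.is_cyclic else "non-cyclic"
    print(f"Cycle {p.cycle_id}: CN={p.copy_count} ({kind})")
```

In this toy input the highest-CN path is non-cyclic while a lower-CN path is cyclic, which is the situation described above.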

Before recommending additional parameters, it would be helpful if you could share the commands you used to run AmpliconSuite-pipeline. If it was run entirely with default parameters, you could certainly try a run with `--downsample 40` or `--downsample 50` to see if there is any improvement in CN segmentation or SV detection (keep in mind that the threshold for calling an SV scales with baseline coverage, so it may not change things that much). Raising `--downsample` may increase runtime a bit, however.
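As a rough sketch of how such reruns could be set up, the Python snippet below prints one command per `--downsample` value. Only `--downsample` comes from this thread; the other flags (`-s`, `-t`, `--bam`, `--run_AA`, `--run_AC`) are written from memory of the pipeline README, and the sample name, BAM path, and thread count are placeholders, so please confirm everything against `AmpliconSuite-pipeline.py --help` and add whatever other arguments your original run used.

```python
import shlex
from typing import List


def build_commands(sample: str, bam: str, downsample_values: List[int],
                   threads: int = 4) -> List[List[str]]:
    """Build one argv list per --downsample value (flags are assumptions)."""
    commands = []
    for ds in downsample_values:
        commands.append([
            "AmpliconSuite-pipeline.py",
            "-s", f"{sample}_ds{ds}",   # separate sample name per run (placeholder)
            "-t", str(threads),
            "--bam", bam,               # placeholder path
            "--run_AA",
            "--run_AC",
            "--downsample", str(ds),    # the parameter suggested above
        ])
    return commands


commands = build_commands("sampleA", "sampleA.bam", [40, 50])
for argv in commands:
    # Printed for review rather than executed.
    print(shlex.join(argv))
```

Giving each run its own sample name keeps the outputs from different `--downsample` settings apart so they can be compared side by side.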

If you would like me to take a look at any of these output files, please feel free to email me at jluebeck [at] ucsd.edu.

Thanks, Jens