AmpliconSuite / AmpliconClassifier

Classify output of AmpliconArchitect to detect types of focal amplifications present
BSD 2-Clause "Simplified" License

About 'Complex non-cyclic' #4

Open FromSoSimple opened 3 years ago

FromSoSimple commented 3 years ago

Hi,

Thanks very much for sharing this tool to parse AmpliconArchitect (AA) results. My original understanding of AA's *cycles.txt file is that if a cycle includes segment 0 at the start and at the end, then it is either a linear amplification or a complex rearrangement; if a cycle does not include segment 0, then it is cyclic: the last segment loops back to connect to the first segment.

But when I ran AmpliconClassifier (AC) on one of my AA outputs (the amplicon1_cycle.txt file pasted below), where it is clear that Cycles 1, 2, 3, 4, and 5 do not have segment 0, AC classified this amplicon as 'Complex non-cyclic'. I wonder why it is not classified as 'Cyclic'. Thank you!

Interval 1 X 127918918 127941545
Interval 2 X 128850387 129069878
Interval 3 X 131390129 131458996
Interval 4 X 134226913 134266980
Interval 5 X 135460501 135522871
Interval 6 X 142667656 142690042
List of cycle segments
Segment 1 X 127918918 127921210
Segment 2 X 127918918 127929347
Segment 3 X 127918918 127941545
Segment 4 X 127921117 127921210
Segment 5 X 127921117 127921296
Segment 6 X 127921151 127921210
Segment 7 X 128850387 128870947
Segment 8 X 128870383 128870983
Segment 9 X 128870383 128870985
Segment 10 X 128870433 128870697
Segment 11 X 128870433 128870826
Segment 12 X 128870433 128870844
Segment 13 X 128870433 128870947
Segment 14 X 128870433 128871516
Segment 15 X 128870443 128870682
Segment 16 X 128870443 128870947
Segment 17 X 128870474 128870826
Segment 18 X 128871077 128871423
Segment 19 X 128871077 128871441
Segment 20 X 128871077 128871490
Segment 21 X 128871077 128871497
Segment 22 X 128871077 128871545
Segment 23 X 128871178 128871423
Segment 24 X 128871178 128871490
Segment 25 X 128871178 128871566
Segment 26 X 128871178 128871920
Segment 27 X 128871210 128871490
Segment 28 X 128871210 128871516
Segment 29 X 128871210 129069878
Segment 30 X 128871321 128871516
Segment 31 X 128871397 128871545
Segment 32 X 128871780 129069878
Segment 33 X 131390129 131438835
Segment 34 X 131413108 131448398
Segment 35 X 131438260 131438835
Segment 36 X 131438310 131438436
Segment 37 X 131438310 131438835
Segment 38 X 131438434 131438436
Segment 39 X 131438434 131438850
Segment 40 X 131438521 131448398
Segment 41 X 131438541 131438683
Segment 42 X 131438541 131438835
Segment 43 X 131438568 131438683
Segment 44 X 131447145 131448398
Segment 45 X 131452790 131456186
Segment 46 X 131452790 131458996
Segment 47 X 134226913 134247009
Segment 48 X 134233109 134233361
Segment 49 X 134246839 134247009
Segment 50 X 134246854 134246980
Segment 51 X 134246854 134266980
Segment 52 X 134246913 134247009
Segment 53 X 135460501 135522871
Segment 54 X 135480390 135503158
Segment 55 X 135502443 135502601
Segment 56 X 142667656 142690042
Segment 57 X 142689572 142690042
Cycle=1;Copy_count=10.7116740972;Segments=9+,5-
Cycle=2;Copy_count=7.13433873533;Segments=37+,25+,17+
Cycle=3;Copy_count=4.3826755996;Segments=39+,15-
Cycle=4;Copy_count=2.3093359841;Segments=12+,19-
Cycle=5;Copy_count=2.21980533822;Segments=13+,13-,21+,6+,20-
Cycle=6;Copy_count=2.19718248634;Segments=0+,56-,0-
Cycle=7;Copy_count=2.09796638102;Segments=8+,49+,27+,4-
Cycle=8;Copy_count=2.03240257232;Segments=26+,35+
Cycle=9;Copy_count=1.98836332533;Segments=0+,33+,24+,1-,0-
Cycle=10;Copy_count=1.14207766849;Segments=0+,3-,0-
Cycle=11;Copy_count=1.06159424616;Segments=0+,53-,0-
Cycle=12;Copy_count=1.01079937939;Segments=0+,29-,47-,0-
Cycle=13;Copy_count=0.951940523565;Segments=0+,51-,38-,16+,7-,0-
Cycle=14;Copy_count=0.677046785145;Segments=42+,23+
Cycle=15;Copy_count=0.550129044225;Segments=34+,45+
Cycle=16;Copy_count=0.526777056708;Segments=54+,14-,22+
Cycle=17;Copy_count=0.503671349318;Segments=0+,46-,44-,48+,32+,0-
Cycle=18;Copy_count=0.405645559019;Segments=54+,28-,52-,31+
Cycle=19;Copy_count=0.309049994232;Segments=0+,2+,43-,55+,50-,36-,11-,18+,41+,2-,0-
Cycle=20;Copy_count=0.250119950562;Segments=54+,30-,10-,22+
Cycle=21;Copy_count=0.249063483838;Segments=0+,46-,40-,57+,0-
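For readers following along, the segment 0 convention described in this post can be checked mechanically. The snippet below is a minimal sketch, not part of AA or AC: the helper names `parse_cycle_line` and `is_cyclic_path` are hypothetical, and the only assumption is the `Cycle=...;Copy_count=...;Segments=...` line layout shown in the pasted file.

```python
def parse_cycle_line(line):
    """Parse one 'Cycle=...;Copy_count=...;Segments=...' line into a dict."""
    fields = dict(part.split("=", 1) for part in line.strip().split(";"))
    # Each token is a segment ID followed by its orientation, e.g. "37+" or "0-".
    segments = [(int(tok[:-1]), tok[-1]) for tok in fields["Segments"].split(",")]
    return {
        "cycle": int(fields["Cycle"]),
        "copy_count": float(fields["Copy_count"]),
        "segments": segments,
    }


def is_cyclic_path(cycle):
    """True when the path never touches segment 0, i.e. it closes on itself."""
    return all(seg_id != 0 for seg_id, _ in cycle["segments"])


# Two lines copied from the file above.
example_lines = [
    "Cycle=1;Copy_count=10.7116740972;Segments=9+,5-",
    "Cycle=6;Copy_count=2.19718248634;Segments=0+,56-,0-",
]

parsed = [parse_cycle_line(line) for line in example_lines]
for cycle in parsed:
    kind = "cyclic path" if is_cyclic_path(cycle) else "non-cyclic path"
    print(cycle["cycle"], kind)
```

This only labels paths in the AA decomposition; it says nothing about how AC classifies the amplicon as a whole.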

FromSoSimple commented 3 years ago

Looks like their lengths might be too short?

jluebeck commented 3 years ago

Hi,

Sorry for the delay in responding. By default, the classifier expects each cyclic path that makes up an ecDNA to be at least 10 kbp long. At present, we do not attempt to determine the origin of cyclic decompositions that are significantly smaller than that.
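To illustrate the size point with the file posted above, the sketch below sums the segment lengths of Cycles 1 to 5 and compares them with a 10 kbp cutoff. This is an assumption-laden illustration rather than AC's code: `MIN_CYCLE_BP` is a hypothetical constant standing in for the default mentioned here, segment coordinates are treated as inclusive, and only the segments used by those five cycles are copied in.

```python
MIN_CYCLE_BP = 10_000  # hypothetical name for the 10 kbp default described above

# (start, end) on chrX for the segments used by Cycles 1-5, copied from the post.
segments = {
    5: (127921117, 127921296),
    6: (127921151, 127921210),
    9: (128870383, 128870985),
    12: (128870433, 128870844),
    13: (128870433, 128870947),
    15: (128870443, 128870682),
    17: (128870474, 128870826),
    19: (128871077, 128871441),
    20: (128871077, 128871490),
    21: (128871077, 128871497),
    25: (128871178, 128871566),
    37: (131438310, 131438835),
    39: (131438434, 131438850),
}

# Segment IDs per cycle; orientation does not affect length, so it is dropped.
cyclic_paths = {
    1: [9, 5],
    2: [37, 25, 17],
    3: [39, 15],
    4: [12, 19],
    5: [13, 13, 21, 6, 20],
}


def segment_length(seg_id):
    """Length in bp, assuming inclusive start and end coordinates."""
    start, end = segments[seg_id]
    return end - start + 1


path_lengths = {
    cycle_id: sum(segment_length(s) for s in seg_ids)
    for cycle_id, seg_ids in cyclic_paths.items()
}

for cycle_id, length in path_lengths.items():
    status = "meets" if length >= MIN_CYCLE_BP else "is below"
    print(f"Cycle {cycle_id}: {length} bp {status} the {MIN_CYCLE_BP} bp cutoff")
```

Under these assumptions, every one of the five paths comes out at under 2 kbp, well short of the cutoff.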

Best, Jens

FromSoSimple commented 3 years ago

Thanks, Jens!

FromSoSimple commented 3 years ago

Hi Jens,

I have another case with the following AA results:

Interval 1 2 15662961 16718671
List of cycle segments
Segment 1 2 15762954 15982958
Segment 2 2 15762954 16271651
Segment 3 2 15984443 16108827
Segment 4 2 15984443 16120775
Segment 5 2 16109441 16120775
Segment 6 2 16123800 16271651
Segment 7 2 16273865 16282960
Segment 8 2 16332961 16402960
Segment 9 2 16332961 16618682
Segment 10 2 16422961 16484159
Cycle=1;Copy_count=129.66200028;Segments=0+,7-,2-,9-,0-
Cycle=2;Copy_count=70.7033566593;Segments=0+,7-,6-,5+,3+,1-,9-,0-
Cycle=3;Copy_count=18.1865904689;Segments=4+
Cycle=4;Copy_count=17.1845928766;Segments=0+,8-,0-
Cycle=5;Copy_count=12.0828255791;Segments=0+,10-,0-

To me, Cycle 3 is cyclic according to the AA manual, but AC classifies this case as Complex non-cyclic. Any idea why?

jluebeck commented 3 years ago

Hi,

The decompositions produced by AA are not classifications of the amplicon into a biological class of amplification. They represent a more abstract decomposition of the breakpoint graph structure. The presence of cyclic graph decompositions is indeed associated with ecDNA. However, cycles in the graph can arise by different mechanisms, such as BFB. You can see the more biological classifications in the AC output columns which indicate ecDNA+ or BFB+.

In this case, it appears the vast majority of the amplicon CN is explained in non-cyclic structures. Thus, it is likely AC does not have sufficient evidence to bioinformatically call an ecDNA in this case. Due to the rearranged profile of the segments, it then applies the "complex non-cyclic" amplicon classification. Note that AC having insufficient evidence to call an ecDNA does not imply the sample is certainly negative for ecDNA.
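One rough way to see this for the second example is to weight each path's length by its copy count and ask what share comes from cyclic paths. The sketch below is only an illustration under stated assumptions: it is not the metric AC actually computes, coordinates are treated as inclusive, and names such as `cyclic_fraction` are hypothetical.

```python
# (start, end) on chr2 for each segment, copied from the post above.
segments = {
    1: (15762954, 15982958),
    2: (15762954, 16271651),
    3: (15984443, 16108827),
    4: (15984443, 16120775),
    5: (16109441, 16120775),
    6: (16123800, 16271651),
    7: (16273865, 16282960),
    8: (16332961, 16402960),
    9: (16332961, 16618682),
    10: (16422961, 16484159),
}

# (copy_count, segment IDs) per cycle; segment 0 marks a non-cyclic path.
cycles = {
    1: (129.66200028, [0, 7, 2, 9, 0]),
    2: (70.7033566593, [0, 7, 6, 5, 3, 1, 9, 0]),
    3: (18.1865904689, [4]),
    4: (17.1845928766, [0, 8, 0]),
    5: (12.0828255791, [0, 10, 0]),
}


def path_length(seg_ids):
    """Total bp over real segments, skipping the segment 0 placeholder."""
    return sum(segments[s][1] - segments[s][0] + 1 for s in seg_ids if s != 0)


cyclic_weight = 0.0
total_weight = 0.0
for copy_count, seg_ids in cycles.values():
    weight = copy_count * path_length(seg_ids)  # copy-count-weighted length
    total_weight += weight
    if 0 not in seg_ids:
        cyclic_weight += weight

cyclic_fraction = cyclic_weight / total_weight
print(f"Share of weighted length in cyclic paths: {cyclic_fraction:.1%}")
```

With these numbers the single cyclic path (Cycle 3) accounts for under 2% of the weighted total, in line with the observation that most of the amplicon copy number sits in non-cyclic structures.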

Best, Jens

FromSoSimple commented 3 years ago

Got it. Thanks very much, Jens.