Open eesiribloom opened 2 weeks ago
Hi, this is a great question. Some degree of interpretation is still required for CoRAL outputs. We have not yet built up a diverse enough sample set to determine how well AmpliconClassifier would work on CoRAL results, or what modifications would be needed to make it compatible. This will change as we analyze more samples, but we won't be ready to connect those two tools for some time.
A big risk with calling all genome cycles ecDNA is that some might be derived from a breakage-fusion-bridge (BFB) process. If you see a high degree of foldback inversions, reconstructions that contain palindromic series of segments, and step-wise changes in CN, then BFB is more likely. Genome cycles decomposed with low CN may also not be ecDNA. High-CN decompositions that are closed in a head-to-tail fashion are more likely to be ecDNA. We're happy to help interpret results if needed.
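As a rough illustration of the "palindromic series of segments" signal, here is a minimal Python sketch that scans one `Segments=` string from a cycles.txt entry and reports the longest run that is immediately followed by its own reverse complement (optionally around a single pivot segment). The function names `parse_segments` and `longest_palindromic_arm` are hypothetical and not part of CoRAL, and this is only a crude screening heuristic under those assumptions, not a BFB classifier.

```python
def parse_segments(field):
    """Turn '0+,5+,6-' into [(0, '+'), (5, '+'), (6, '-')]."""
    return [(int(tok[:-1]), tok[-1]) for tok in field.split(",") if tok]


def longest_palindromic_arm(segs):
    """Length of the longest run of segments that is mirrored, in opposite
    orientation, directly after it (or after a single pivot segment).

    Hypothetical helper: a long arm hints at a palindromic structure,
    which is only one of several BFB-like signals.
    """
    # Segment 0 is dropped first: 0+ / 0- are source nodes, not amplified
    # sequence, and would otherwise pair up as a spurious palindrome.
    segs = [s for s in segs if s[0] != 0]
    n = len(segs)
    best = 0
    for gap in (0, 1):  # 0: arms touch; 1: one pivot segment in between
        for i in range(n):
            left, right = i, i + 1 + gap
            arm = 0
            while left >= 0 and right < n:
                (l_id, l_ori), (r_id, r_ori) = segs[left], segs[right]
                # Mirror pair = same segment, opposite orientation.
                if l_id != r_id or l_ori == r_ori:
                    break
                arm += 1
                left -= 1
                right += 1
            best = max(best, arm)
    return best


if __name__ == "__main__":
    # A path that folds back on itself around one pivot segment.
    print(longest_palindromic_arm(parse_segments("0+,25-,24-,23-,24+,25+,0-")))  # 2
    # A simple head-to-tail run with no mirrored arm.
    print(longest_palindromic_arm(parse_segments("0+,39+,40+,41+,42+,0-")))      # 0
```

A non-zero arm length is a reason to look more closely at foldback reads and step-wise CN changes in that region; it is not evidence of BFB on its own.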
Thanks, Jens
Thank you for your reply. Here is an example of a cycles.txt file from an amplicon. Note: this was generated with default parameters. When I increased --min_bp_support to 5.0 or 10.0 I saw a significant decrease in paths output - which is expected - but I was no longer able to reconstruct any circular paths, so I went back to default.
I would assume I'm looking for circular paths, paths that satisfy constraints (I'm not entirely sure what this means, though), and paths that have high copy number and subpaths with high support (is this read support?).
Does 0+ and 0- indicate a circular path?
For example, based on previous results from AA and decoil, cycle 1 and cycle 6 are of interest and overlap regions previously estimated to be contained within ecDNA, but I'm not sure if these results necessarily support that.
Interval 1 chr3 130528734 130728734
Interval 2 chr8 42836869 43036869
Interval 3 chr12 23184602 28059521
Interval 4 chr12 29101551 29301551
Interval 5 chr12 31614458 32309448
Interval 6 chr12 72410272 72665271
Interval 7 chr19 29200032 29946017
Interval 8 chr19 33146201 34601276
Interval 9 chr19 39031534 40328962
Interval 10 chr20 19475071 19900922
Interval 11 chrX 1739217 3193027
List of cycle segments
Segment 1 chr3 130528734 130628733
Segment 2 chr3 130628734 130728734
Segment 3 chr8 42836869 42936868
Segment 4 chr8 42936869 43036869
Segment 5 chr12 23184602 23313736
Segment 6 chr12 23313737 23782045
Segment 7 chr12 23782046 23797060
Segment 8 chr12 23797061 23957745
Segment 9 chr12 23957746 23997864
Segment 10 chr12 23997865 25265269
Segment 11 chr12 25265270 25360453
Segment 12 chr12 25360454 25622905
Segment 13 chr12 25622906 25622918
Segment 14 chr12 25622919 25780115
Segment 15 chr12 25780116 25803970
Segment 16 chr12 25803971 25806645
Segment 17 chr12 25806646 25917588
Segment 18 chr12 25917589 27958246
Segment 19 chr12 27958247 28059521
Segment 20 chr12 29101551 29201551
Segment 21 chr12 29201552 29301551
Segment 22 chr12 31614458 31713292
Segment 23 chr12 31713293 31716434
Segment 24 chr12 31716435 32206875
Segment 25 chr12 32206876 32309448
Segment 26 chr12 72410272 72512816
Segment 27 chr12 72512817 72563536
Segment 28 chr12 72563537 72665271
Segment 29 chr19 29200032 29300031
Segment 30 chr19 29300032 29459277
Segment 31 chr19 29459278 29465259
Segment 32 chr19 29465260 29467530
Segment 33 chr19 29467531 29785292
Segment 34 chr19 29785293 29822449
Segment 35 chr19 29822450 29843120
Segment 36 chr19 29843121 29946017
Segment 37 chr19 33146201 34500466
Segment 38 chr19 34500467 34601276
Segment 39 chr19 39031534 39132453
Segment 40 chr19 39132454 39643471
Segment 41 chr19 39643472 40228962
Segment 42 chr19 40228963 40328962
Segment 43 chr20 19475071 19575070
Segment 44 chr20 19575071 19800051
Segment 45 chr20 19800052 19900922
Segment 46 chrX 1739217 1911898
Segment 47 chrX 1911899 1937067
Segment 48 chrX 1937068 2163657
Segment 49 chrX 2163658 2186393
Segment 50 chrX 2186394 2187092
Segment 51 chrX 2187093 2206100
Segment 52 chrX 2206101 2277483
Segment 53 chrX 2277484 2282342
Segment 54 chrX 2282343 2374471
Segment 55 chrX 2374472 2376727
Segment 56 chrX 2376728 2390794
Segment 57 chrX 2390795 3048708
Segment 58 chrX 3048709 3193027
List of longest subpath constraints
Path constraint 1 2-,13+,14+ Support=17 Satisfied
Path constraint 2 40-,35+,47+ Support=1 Satisfied
Path constraint 3 7-,32-,30- Support=9 Satisfied
Path constraint 4 30+,32+,33+ Support=12 Satisfied
Path constraint 5 49+,51+,52+ Support=2 Satisfied
Path constraint 6 54+,56+,57+ Support=1 Satisfied
Path constraint 7 15+,16+,15- Support=12 Unsatisfied
Path constraint 8 24-,23+,24+ Support=14 Satisfied
Path constraint 9 6+,7+,15+ Support=1 Unsatisfied
Path constraint 10 12+,13+,13+,14+ Support=58 Satisfied
Path constraint 11 6+,7+,8+ Support=2 Satisfied
Path constraint 12 15+,16+,17+ Support=96 Satisfied
Path constraint 13 22+,23+,24+ Support=15 Satisfied
Path constraint 14 49+,50+,51+ Support=8 Satisfied
Path constraint 15 54+,55+,56+ Support=2 Satisfied
Cycle=1;Copy_count=3.8574164389262715;Segments=0+,5+,6+,7+,8+,9+,10+,11+,12+,13+,14+,15+,16+,17+,27+,6+,7+,8+,9+,10+,11+,12+,13+,14+,15+,16+,17+,27+,6+,7+,8+,9+,10+,11+,12+,13+,13+,14+,15+,16+,17+,18+,10+,11+,12+,13+,14+,15+,16+,17+,27+,6+,7+,8+,9+,10+,11+,12+,13+,14+,15+,16+,17+,18+,19+,0-;Path_constraints_satisfied=10,11,12
Cycle=2;Copy_count=3.8001404858032872;Segments=0+,39+,40+,41+,42+,0-;Path_constraints_satisfied=
Cycle=3;Copy_count=3.165662498442714;Segments=0+,20+,21+,0-;Path_constraints_satisfied=
Cycle=4;Copy_count=3.04413374549155;Segments=0+,1+,2+,0-;Path_constraints_satisfied=
Cycle=5;Copy_count=3.0309324976212757;Segments=0+,22+,23+,24+,25+,0-;Path_constraints_satisfied=13
Cycle=6;Copy_count=2.948191433018197;Segments=0+,29+,30+,32+,33+,34+,35+,47+,34+,35+,47+,48+,49+,51+,52+,54+,56+,57+,58+,0-;Path_constraints_satisfied=4,5
Cycle=7;Copy_count=2.1959669554063765;Segments=0+,26+,27+,28+,0-;Path_constraints_satisfied=
Cycle=8;Copy_count=2.123415783042603;Segments=0+,37+,38+,0-;Path_constraints_satisfied=
Cycle=9;Copy_count=1.970303031811106;Segments=0+,4-,44+,57-,56-,54-,53-,52-,51-,50-,49-,48-,47-,35-,34-,47-,35-,40+,41+,37-,0-;Path_constraints_satisfied=2,6,14
Cycle=10;Copy_count=1.8285167303750316;Segments=0+,43+,44+,57-,56-,54-,52-,51-,49-,48-,47-,46-,0-;Path_constraints_satisfied=
Cycle=11;Copy_count=1.60212559198529;Segments=0+,4-,44+,45+,0-;Path_constraints_satisfied=
Cycle=12;Copy_count=1.1282921215430193;Segments=0+,20+,41+,42+,0-;Path_constraints_satisfied=
Cycle=13;Copy_count=0.9005756417726687;Segments=0+,5+,6+,7+,8+,9+,10+,11+,24-,23+,24+,11-,10-,9-,51-,49-,48-,47-,35-,34-,33-,32-,30-,29-,0-;Path_constraints_satisfied=8
Cycle=14;Copy_count=0.8945466985190293;Segments=0+,29+,30+,32+,33+,34+,35+,36+,0-;Path_constraints_satisfied=
Cycle=15;Copy_count=0.8572647184183353;Segments=0+,20+,41+,37-,0-;Path_constraints_satisfied=
Cycle=16;Copy_count=0.7190071710443533;Segments=0+,43+,44+,57-,56-,55-,54-,53-,52-,51-,49-,48-,47-,46-,0-;Path_constraints_satisfied=
Cycle=17;Copy_count=0.6402548518493347;Segments=0+,2-,13+,13+,14+,15+,16+,17+,18+,10+,11+,24-,23-,24+,11-,10-,9-,8-,7-,6-,27-,26-,0-;Path_constraints_satisfied=1
Cycle=18;Copy_count=0.5216167669073717;Segments=0+,25-,24-,23-,24+,25+,0-;Path_constraints_satisfied=
Cycle=19;Copy_count=0.21732915087771373;Segments=0+,5+,6+,7+,8+,9+,10+,11+,12+,13+,13+,14+,15+,16+,17+,18+,19+,0-;Path_constraints_satisfied=
Cycle=20;Copy_count=0.17478185137363725;Segments=0+,5+,6+,7+,8+,9+,10+,11+,12+,13+,13+,14+,15+,16+,17+,18+,10+,11+,24-,23+,24+,11-,10-,9-,8-,7-,32-,30-,56-,55-,54-,53-,52-,51-,50-,49-,48-,47-,35-,34-,47-,35-,34-,33-,32-,31-,30-,29-,0-;Path_constraints_satisfied=3,15
Hi,
Thanks for providing this output file.
The 0+ / 0- notation at the ends of entries in the cycles section indicates what we termed "source" nodes. These connect to the linear genome adjacent to the listed amplified segments. Entries that start and end at a source node are therefore not true genome cycles and are not directly indicative of ecDNA.
The path constraints listed above the cycles do not report cyclic paths and do not use the 0+ / 0- notation, which can be a bit confusing.
Because all entries for the decomposed paths in the Cycles section of the file are non-cyclic, I think it is fair to say that CoRAL did not identify ecDNA in this amplicon. It may exist but was not detected here.
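The check described above can be sketched in a few lines of Python: parse each `Cycle=...` line and flag whether it touches the source node (segment 0). The helper name `parse_cycle_line` is hypothetical and not part of CoRAL. The second example line is made up for illustration, and treating any entry without `0+` / `0-` as a candidate cycle is an assumption about the format based only on the description in this thread.

```python
def parse_cycle_line(line):
    """Parse one 'Cycle=...;Copy_count=...;Segments=...;...' line into a dict.

    Hypothetical helper; the field names are taken from the cycles.txt
    excerpt posted in this thread.
    """
    fields = dict(item.split("=", 1) for item in line.strip().split(";"))
    segments = [tok for tok in fields["Segments"].split(",") if tok]
    constraints = [
        int(c) for c in fields.get("Path_constraints_satisfied", "").split(",") if c
    ]
    return {
        "id": int(fields["Cycle"]),
        "copy_count": float(fields["Copy_count"]),
        "segments": segments,
        "constraints": constraints,
        # 0+ / 0- are source nodes, so the entry is a path, not a cycle.
        "touches_source": any(tok[:-1] == "0" for tok in segments),
    }


if __name__ == "__main__":
    lines = [
        # Copied from the posted output (cycle 2).
        "Cycle=2;Copy_count=3.8001404858032872;Segments=0+,39+,40+,41+,42+,0-;"
        "Path_constraints_satisfied=",
        # Made-up entry without source nodes, for illustration only.
        "Cycle=99;Copy_count=10.0;Segments=1+,2+;Path_constraints_satisfied=1",
    ]
    entries = [parse_cycle_line(l) for l in lines]
    # Highest copy count first, since low-CN decompositions are less convincing.
    for e in sorted(entries, key=lambda e: e["copy_count"], reverse=True):
        kind = "path (touches source)" if e["touches_source"] else "candidate cycle"
        print(e["id"], round(e["copy_count"], 2), kind)
```

Run over the file posted above, every entry would be flagged as a path, which matches the reading that no cyclic decomposition was found in this amplicon.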
Thanks and let me know if there are other questions, Jens
Is there a need to classify the output of CoRAL or are all circular paths/amplicons output estimated to be ecDNA specifically? Thanks again for the tool