AmpliconSuite / AmpliconClassifier

Classify output of AmpliconArchitect to detect types of focal amplifications present
BSD 2-Clause "Simplified" License

About the difference between the classification of the current version and the legacy_natgen_2020 #14

Open xieduo7 opened 1 year ago

xieduo7 commented 1 year ago

Hi @jluebeck ,

I have three questions regarding the classification of the amplicon.

  1. There are some differences (see image below) between the classes in the current version and those in legacy_natgen_2020 (from the paper). Are the circular amplification and heavily rearranged amplification classes in legacy_natgen_2020 equivalent to ecDNA and Complex non-cyclic, respectively?

image

  2. The legacy_natgen_2020 paper states:

While an amplicon may fit the requirements for several categories (that is, a circular amplicon may also comprise heavily rearranged amplifications), priority was given to the BFB amplification category, followed by circular, heavily rearranged, and then linear.

So my second question is: do you have a suggested priority for the current version's classification? Is it BFB amplification > ecDNA > Complex non-cyclic > linear?

  3. In the README of AmpliconClassifier, it says:

    Note that Cyclic can refer to either BFB or ecDNA.

However, I found that some Complex non-cyclic amplicons are also labeled as BFB. Does this make sense? If so, what priority should such an amplicon have?

Thank you!

Best, Duo

jluebeck commented 1 year ago

Hi Duo,

  1. Your table is correct.
  2. In the 2020 paper, an amplicon could belong to only one class. In the latest versions of AC, however, AA amplicons may contain multiple focal amplification features; e.g., they may contain an ecDNA in one part of the amplicon and a BFB elsewhere, in which case they will be marked as both ecDNA+ and BFB+. My suggestion is not to enforce a hierarchical classification system in which a sample is annotated with only one class regardless of the other focal amplifications present, as this will result in a loss of information. The classifier itself first checks that cycles are not consistent with BFB, then evaluates the ecDNA status. However, this is different from applying a hierarchy to the detected focal amplifications.
  3. Yes, cyclic can refer to ecDNA or BFB. However, complex non-cyclic can also refer to BFB. If you are enforcing a hierarchical classification system, then BFB should take priority, by my estimation.

Thanks, Jens
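
For anyone who still needs a single label per amplicon, the priority discussed above can be sketched roughly as follows. This is only an illustration, not part of AmpliconClassifier: the function name `collapse_to_single_class`, the `PRIORITY` list, and the plain-string labels are hypothetical. Only "BFB first" comes from the reply above; the order of the remaining classes follows the one proposed in the question and is an assumption.

```python
# Hypothetical priority order. BFB is placed first per the reply above;
# the order of the remaining classes is an assumption taken from the question.
PRIORITY = ["BFB", "ecDNA", "Complex non-cyclic", "Linear"]


def collapse_to_single_class(features):
    """Return the highest-priority class among the features found in one amplicon.

    `features` is any iterable of class labels detected in the amplicon
    (a hypothetical representation, not AmpliconClassifier's output format).
    Returns None if none of the known labels is present.
    """
    present = set(features)
    for label in PRIORITY:
        if label in present:
            return label
    return None


# An amplicon marked both ecDNA+ and BFB+ collapses to a single label.
print(collapse_to_single_class({"ecDNA", "BFB"}))
```

Collapsing this way discards the other features found in the amplicon, which is the loss of information the reply warns about, so it may be worth keeping the full set of features alongside the collapsed label.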