Creation of patient tree clones

Hi Florian, here is a detailed description of the data format of the JSON files. For each sample we provide one JSON file with the following data:

Patient's sample format:

id: str, sample name

patient: str, name of patient

cohort: str, cohort of the patient

OS: float, (optional) overall survival time

PFS: float, (optional) progression-free survival time

status: int, (optional) dead/alive

HLA_genes: list of str, (optional) HLA alleles of the patient

mutations: list

     all mutations observed across all samples of the patient, for each mutation report:

       id: str
           format <chrom>_<position>_<ref_nucleotide>_<alt_nucleotide>
       gene: str
           gene name
       missense: int
           1 if missense else 0

       e.g.
       {
        "id": "1_12172228_G_A",
        "gene": "TNFRSF8",
        "missense": 0
       }
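As an illustration of the mutation id format above, here is a minimal Python sketch that splits an id into its four fields. The names `MutationKey` and `parse_mutation_id` are hypothetical helpers, not part of the repository, and the sketch assumes that chromosome names never contain an underscore.

```python
from typing import NamedTuple


class MutationKey(NamedTuple):
    """Fields of a mutation id (hypothetical helper type)."""
    chrom: str      # kept as a string, since chromosomes may be named X or Y
    position: int
    ref: str
    alt: str


def parse_mutation_id(mutation_id: str) -> MutationKey:
    """Split <chrom>_<position>_<ref_nucleotide>_<alt_nucleotide>.

    Assumption: no field contains an underscore, so the id has
    exactly four underscore-separated parts.
    """
    chrom, position, ref, alt = mutation_id.split("_")
    return MutationKey(chrom, int(position), ref, alt)


key = parse_mutation_id("1_12172228_G_A")
print(key)
```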

neoantigens: list

     all neoantigens observed across all samples of the patient, for each neoantigen report:

       id: str
           format <chrom>_<position>_<ref_nucleotide>_<alt_nucleotide>_<mutated_position>_<peptide_length>_<HLA_allele>
       mutation_id: str
       HLA_gene_id: str
       sequence: str
       WT_sequence: str
       mutated_position: int
       Kd: float
       KdWT: float

       e.g.
       {
        "id": "19_44352078_G_A_5_9_C0303",
        "mutation_id": "19_44352078_G_A",
        "HLA_gene_id": "HLA-C03:03",
        "sequence": "KAFSHGYHL",
        "WT_sequence": "KAFSRGYHL",
        "mutated_position": 5,
        "Kd": 29.0,
        "KdWT": 30.0
       }
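The neoantigen fields are partly redundant, which allows a quick sanity check on a record. The sketch below is an assumption-laden illustration, not the repository's code: `check_neoantigen` is a hypothetical helper, `mutated_position` is taken to be 1-based (inferred from the example above, where position 5 is the H/R difference), and each neoantigen is assumed to differ from its wild type by a single substitution.

```python
def check_neoantigen(neo: dict) -> list:
    """Return a list of human-readable problems found in one record."""
    problems = []

    # The neoantigen id is expected to extend the id of its mutation.
    if not neo["id"].startswith(neo["mutation_id"] + "_"):
        problems.append("id does not start with mutation_id")

    mutant, wild_type = neo["sequence"], neo["WT_sequence"]
    if len(mutant) != len(wild_type):
        problems.append("sequence and WT_sequence differ in length")
    else:
        # 1-based positions where the two peptides differ (assumption:
        # mutated_position counts from 1, as the example suggests).
        diffs = [i + 1 for i, (a, b) in enumerate(zip(mutant, wild_type)) if a != b]
        if diffs != [neo["mutated_position"]]:
            problems.append(f"peptides differ at {diffs}, "
                            f"expected [{neo['mutated_position']}]")
    return problems


example = {
    "id": "19_44352078_G_A_5_9_C0303",
    "mutation_id": "19_44352078_G_A",
    "HLA_gene_id": "HLA-C03:03",
    "sequence": "KAFSHGYHL",
    "WT_sequence": "KAFSRGYHL",
    "mutated_position": 5,
    "Kd": 29.0,
    "KdWT": 30.0,
}
problems = check_neoantigen(example)
print(problems)
```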

sample_trees: list of trees, Tree format described below

Tree format:

 topology: Node
         root clone node of the tree, Node format described below

 score: float, log-likelihood score (from PhyloWGS)
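Since each sample carries several trees with a log-likelihood score, a common first step is to pick the highest-scoring one. The sketch below shows one way to do that with the standard `json` module. The sample content is made up for illustration (names, scores and frequencies are placeholders, not real data), and in practice the text would be read from a sample file rather than an inline string.

```python
import json

# Made-up sample for illustration only; real files contain more fields.
sample_json = """
{
  "id": "sample_1",
  "patient": "patient_1",
  "cohort": "cohort_A",
  "sample_trees": [
    {"topology": {"clone_id": 0, "clone_mutations": [], "children": [],
                  "X": 1.0, "x": 1.0},
     "score": -1250.5},
    {"topology": {"clone_id": 0, "clone_mutations": [], "children": [],
                  "X": 1.0, "x": 1.0},
     "score": -1190.2}
  ]
}
"""

sample = json.loads(sample_json)

# A higher log-likelihood means a better fit, so take the maximum score.
best_tree = max(sample["sample_trees"], key=lambda tree: tree["score"])
print(best_tree["score"])
```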

Node format:

 clone_id: int

 clone_mutations: list
         list of mutation identifiers that originate in that clone, e.g.
         ["20_16360370_G_C", "2_89869798_C_A", ...]

  children: list of children nodes, in Node format

  X: float,
     cancer cell fraction, CCF

  x: float, exclusive frequency (see eq. 8)

  new_x: float, frequency if this is a new clone (optional)
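Because the Node format is recursive, a tree is most easily walked with a recursive generator. The sketch below is a hedged illustration: `iter_clones` is a hypothetical helper, the toy tree and its frequencies are arbitrary placeholders, and it assumes that a clone carries the mutations of all its ancestors in addition to those listed in its own `clone_mutations`.

```python
def iter_clones(node, inherited=()):
    """Yield (clone_id, all_mutations, X, x) for a node and its descendants.

    Assumption: a clone inherits every mutation of its ancestors, so
    all_mutations = ancestors' mutations + the node's own clone_mutations.
    """
    carried = tuple(inherited) + tuple(node["clone_mutations"])
    yield node["clone_id"], carried, node["X"], node["x"]
    for child in node["children"]:
        yield from iter_clones(child, carried)


# Toy tree with placeholder ids and frequencies (not real data).
root = {
    "clone_id": 0,
    "clone_mutations": ["1_100_A_T"],
    "X": 1.0,
    "x": 0.4,
    "children": [
        {
            "clone_id": 1,
            "clone_mutations": ["2_200_C_G"],
            "X": 0.6,
            "x": 0.6,
            "children": [],
        }
    ],
}

mutations_by_clone = {cid: list(muts) for cid, muts, _, _ in iter_clones(root)}
print(mutations_by_clone)
```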
