Questions about generating CNV Input file

Hello PhyloWGS Devs!

I'm currently trying to run PhyloWGS with copy number variation (CNV) data obtained from whole-exome sequencing. I called CNVs with a tool other than Battenberg or TITAN, and I am trying to convert those calls into a format that matches the provided cnv_data.txt.

However, I am having trouble understanding how to calculate a, the number of reference reads covering a given CNV.

Our copy number calls give us the integer copy numbers of each allele and the prevalence of the CNV, e.g. (2,1) with prevalence 0.25. We also have reference and variant read counts for each SSM.
How would I calculate a (and the matching total read count d) from the above information? My default presumption is to multiply the CNV prevalence by the total read count, but I was wondering if you had a different recommendation.
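
To make the question concrete, here is a minimal Python sketch of the two candidates I can think of. Both function names are hypothetical and neither is taken from the PhyloWGS code. The first is my presumption as stated above. The second is only a guess: it treats the CNV as a heterozygous pseudo-SSM whose variant allele fraction is half the prevalence. The total read count d is assumed to be chosen separately.

```python
def ref_reads_presumed(prevalence, total_reads):
    """My presumption: a = prevalence * d (not confirmed by the docs)."""
    return int(round(prevalence * total_reads))


def ref_reads_pseudo_ssm(prevalence, total_reads):
    """Alternative guess: treat the CNV as a heterozygous pseudo-SSM,
    so the variant fraction is prevalence / 2 and a = (1 - prevalence / 2) * d.
    This is an assumption, not a documented PhyloWGS formula."""
    return int(round((1 - prevalence / 2) * total_reads))


# Example: a CNV with prevalence 0.25 and an assumed d of 1000 reads.
print(ref_reads_presumed(0.25, 1000))
print(ref_reads_pseudo_ssm(0.25, 1000))
```

The two give very different values of a for the same CNV, so I would like to confirm which one, if either, PhyloWGS expects.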

Additionally, I wanted to clarify my understanding of the example cnv_data.txt input file provided:

cnv a   d   ssms    physical_cnvs
c0  66023,50883,62757,36056,58777   126755,100469,121941,71263,115417   s2,1,2;s4,0,1   chrom=1,start=1234,end=5678,major_cn=2,minor_cn=1,cell_prev=0.8;chrom=X,start=15,end=10000,major_cn=2,minor_cn=0,cell_prev=0.8;chrom=22,start=123,end=456,major_cn=1,minor_cn=0,cell_prev=0.8

This example shows that SSMs s2 and s4 overlap with CNV c0.

However, they appear to harbor different major/minor copy numbers. s2 harbors (1,2) whereas s4 harbors (0,1).
How can the same CNV have two different copy number states in the input?
Furthermore, why does the same CNV have different states in the last column? I see copy number states (2,1), (2,0), and (1,0) in the last column, but I thought a single CNV should have a single copy number state.
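
For reference, this is how I am currently reading that row. It is a minimal sketch that assumes the columns are tab-separated, that each ssms entry is an ID followed by two integers, and that each physical_cnvs entry is a list of key=value pairs. The function name and the returned keys are my own, not part of PhyloWGS, and what the two integers mean is exactly what I am asking about.

```python
def parse_cnv_row(line):
    """Split one cnv_data.txt row into its parts (assumed layout, see above)."""
    cnv_id, a, d, ssms, physical = line.rstrip("\n").split("\t")

    # One entry per sample in the a and d columns (assumption).
    ref_reads = [int(x) for x in a.split(",")]
    total_reads = [int(x) for x in d.split(",")]

    # Each overlapping SSM: ID plus two integers whose meaning I want to confirm.
    overlaps = []
    if ssms:
        for triplet in ssms.split(";"):
            ssm_id, first, second = triplet.split(",")
            overlaps.append((ssm_id, int(first), int(second)))

    # Each physical CNV is a comma-separated list of key=value pairs.
    segments = [
        dict(pair.split("=", 1) for pair in segment.split(","))
        for segment in physical.split(";")
    ]
    return {"cnv": cnv_id, "a": ref_reads, "d": total_reads,
            "ssms": overlaps, "physical_cnvs": segments}


row = "\t".join([
    "c0",
    "66023,50883,62757,36056,58777",
    "126755,100469,121941,71263,115417",
    "s2,1,2;s4,0,1",
    "chrom=1,start=1234,end=5678,major_cn=2,minor_cn=1,cell_prev=0.8;"
    "chrom=X,start=15,end=10000,major_cn=2,minor_cn=0,cell_prev=0.8;"
    "chrom=22,start=123,end=456,major_cn=1,minor_cn=0,cell_prev=0.8",
])
parsed = parse_cnv_row(row)
states = {(s["major_cn"], s["minor_cn"]) for s in parsed["physical_cnvs"]}
print(parsed["ssms"])
print(sorted(states))
```

Read this way, the single entry c0 carries two SSM triplets with different integers and three physical segments with three distinct (major_cn, minor_cn) states, which is the source of my confusion.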

Thanks!

Questions about generating CNV Input file #141