DarioS closed 1 year ago
PURPLE looks for a fit that places the highest proportion of the genome at integer copy number values, constrained by penalties for less biologically plausible scenarios (i.e. higher copy number). So in your example it may fit 28%, 44% or 72%, depending on how much of the copy number change is shared between the clones and how much is private to each one. If there is a lot of copy number activity private to each clone, the fit is likely to fail to find a good solution.
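To make the trade-off concrete, here is a toy sketch of the idea in Python. It is not PURPLE's actual scoring. The function names, the `event_penalty` value, the 1% purity grid and the segment numbers are all assumptions made up for illustration, and it assumes a diploid normal and segments already expressed as average absolute copy number (so no ploidy normalisation). Each segment is `(observed average copy number, fraction of genome)`, with a one-copy gain in the 44% clone, the 28% clone, or both.

```python
def implied_copy_number(observed, purity):
    # Mixture model assumption: observed = 2 * (1 - purity) + purity * tumour_cn
    return (observed - 2.0 * (1.0 - purity)) / purity


def fit_score(segments, purity, event_penalty=0.1):
    # Lower is better: genome-weighted distance from an integer copy number,
    # plus a (hypothetical) penalty for copy numbers far from diploid.
    total_weight = sum(weight for _, weight in segments)
    score = 0.0
    for observed, weight in segments:
        cn = implied_copy_number(observed, purity)
        nearest = round(cn)
        deviation = abs(cn - nearest)
        events = abs(nearest - 2)
        score += weight / total_weight * (deviation + event_penalty * events)
    return score


def best_purity(segments, event_penalty=0.1):
    # Grid search over candidate purities from 10% to 100% in 1% steps.
    candidates = [i / 100 for i in range(10, 101)]
    return min(candidates, key=lambda p: fit_score(segments, p, event_penalty))


# One-copy gains: shared by both clones (2.72), private to the 44% clone (2.44),
# private to the 28% clone (2.28). Half the genome is copy-neutral (2.0).
mostly_shared = [(2.0, 0.5), (2.72, 0.4), (2.44, 0.05), (2.28, 0.05)]
mostly_private = [(2.0, 0.5), (2.44, 0.4), (2.72, 0.05), (2.28, 0.05)]

print(best_purity(mostly_shared))
print(best_purity(mostly_private))
```

With these made-up inputs the shared-dominated genome fits best near 72% and the genome dominated by changes private to the larger clone fits best near 44%. In both cases the minority segments sit at non-integer copy number, which is the residual misfit that grows as private activity increases.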
Recently, a couple of algorithms for analysing single-cell RNA sequencing data, CopyKAT and SCEVAN, have identified that some tumours are composed of two or three main clones.
How does that correspond to PURPLE's estimate of tumour purity? Would it be 44% + 28% = 72% (i.e. cancer cell fraction)? Or would it be reported as 44% (i.e. largest pure tumour group)? It might be useful to provide some clarifying sentences in the user guide.