Clonevol does not make any assumption about when the samples are taken. The samples can be taken at the same time point or at many different time points. Clonevol infers models for the individual samples separately and then cross-compares the models between samples to find the consensus model that is valid for all samples. If clonevol infers a model, it should be valid in both cases: same or different time points.
Because clonevol does not make any assumption about time points, its visualization places all samples at the same time point. Clonevol can plot samples at time points provided via the bell.starts parameter of the plot.clonal.models function, e.g.:
s = c(1,2,5)
names(s) = c('sample1', 'sample2', 'sample3')
plot.clonal.models(..., bell.starts=s)
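For more than a handful of samples, the named vector can be built from a sample sheet instead of being typed by hand. The following is only a sketch: the data frame, its column names, and the sample names are hypothetical. Because a later comment in this thread reports an "unused argument (bell.starts = s)" error, it also checks whether the installed clonevol version exposes that argument before relying on it.

```r
# Hypothetical sample sheet: one row per sample with its collection time point.
samples <- data.frame(
  sample     = c("primary", "recurrence1", "recurrence2", "recurrence3"),
  time.point = c(1, 2, 2, 2),
  stringsAsFactors = FALSE
)

# Named numeric vector: names are sample names, values are time points.
s <- setNames(samples$time.point, samples$sample)
print(s)

# Only if clonevol is installed: check that this version of
# plot.clonal.models actually has a bell.starts argument.
if (requireNamespace("clonevol", quietly = TRUE)) {
  f <- get("plot.clonal.models", envir = asNamespace("clonevol"))
  message("bell.starts supported: ", "bell.starts" %in% names(formals(f)))
}
```

If the check reports FALSE, the installed version does not accept the argument, and s cannot be passed as shown above.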
Also, low VAF suggests normal tissue contamination in your tumor samples. This affects the ability to call variants but doesn't affect clonevol inference (given reasonably good variant calls). Everything is just scaled down by the same factor.
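The "scaled down by the same factor" point can be illustrated with a little arithmetic. This sketch assumes heterozygous variants in diploid regions, so that expected VAF = purity x cancer cell fraction / 2. The purity and cell fraction values are made up for illustration.

```r
# Assumption: diploid regions and heterozygous variants, so
# expected VAF = purity * cancer cell fraction (CCF) / 2.
ccf <- c(founding = 1.0, subclone1 = 0.6, subclone2 = 0.2)

vaf_pure   <- 1.0 * ccf / 2 * 100   # 100% tumor purity, VAF in percent
vaf_impure <- 0.6 * ccf / 2 * 100   # 60% tumor purity, VAF in percent

# Every cluster shrinks by the same factor (the purity) ...
print(vaf_impure / vaf_pure)

# ... so cluster VAFs relative to the founding clone are unchanged.
print(vaf_impure / vaf_impure["founding"])
```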
Hello,
I use pyclone to infer subclones, and I have a question about VAF. I use variant_allele_frequenc estimated by pyclone as the VAF input to ClonEvol. The values I get are all less than 1. After reading the ClonEvol tutorial, I found that VAF is calculated the same way in both (the ratio of the number of reads carrying the variant to the total number of reads at the site). If VAF is defined as above, it should be less than 1, right? However, the values in the test data provided in the tutorial are all greater than 1.
VAF is always < 1 (or 100%). In the test data, it is provided as "percentage" and should be <100. Regarding Pyclone as input, you have two options:
(1) Divide the cellular fraction estimate from PyClone by 2 to get a "CN corrected VAF". This is preferred, but keep in mind that PyClone may cap the cellular fraction at 1, which then caps the corrected VAF at 0.5. See the related discussion in: https://github.com/hdng/clonevol/issues/3 and https://github.com/hdng/clonevol/issues/4
(2) Use the uncorrected VAF (calculated as variant reads/total reads), with the assumption that either no CNAs affect your variants, or copy gains and losses are balanced among the variants within each cluster, so that the center (e.g. mean VAF) of a cluster is not affected by copy number alterations.
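The two options above can be sketched in R as follows. The table and its column names (cellular_prevalence, ref_counts, var_counts) are assumptions made for illustration and should be adapted to the actual PyClone output. Both results are expressed as percentages to match the scale of the test data.

```r
# Hypothetical PyClone-style table; column names and values are illustrative.
pyclone <- data.frame(
  mutation_id         = c("m1", "m2", "m3"),
  cluster_id          = c(1, 1, 2),
  cellular_prevalence = c(1.00, 0.96, 0.40),
  ref_counts          = c(52, 60, 81),
  var_counts          = c(48, 40, 19)
)

# Option 1: CN-corrected VAF = cellular fraction / 2, in percent.
# A cellular fraction capped at 1 gives a corrected VAF of at most 50.
pyclone$vaf_corrected <- pyclone$cellular_prevalence / 2 * 100

# Option 2: uncorrected VAF = variant reads / total reads, in percent.
total_reads <- pyclone$var_counts + pyclone$ref_counts
pyclone$vaf_raw <- pyclone$var_counts / total_reads * 100

print(pyclone[, c("mutation_id", "cluster_id", "vaf_corrected", "vaf_raw")])
```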
Thanks for your reply.
Hello, regarding the multiple time points question from the original comment, I have a similar situation, but my calls are from cell-free DNA at 8 different collection dates (the patient has metastatic disease with significant levels of tumor DNA in cfDNA; the highest VAFs are 40-50%). I used pyclone to cluster the variants, and when I tried to use the bell.starts parameter in plot.clonal.models, I received the message "Error in plot.clonal.models(y, box.plot = TRUE, bell.starts = s, fancy.boxplot = TRUE, : unused argument (bell.starts = s)" ... it looks like it's not using the argument?
How do I implement this functionality in plot.clonal.models, and do you have any other advice for using ClonEvol on this type of data set (longitudinal sampling)?
Thank you for the excellent R package and support!
Hi. I'm trying to do a sciclone-clonevol-fishplot workflow for deriving the clonal evolution of my samples. I have a case study of 4 WES samples: a primary tumor (extracted at time point 1) and three different regional recurrences (extracted at time point 2). My WES depth varies from 10X to 30X, and my VAFs top out at around 30. All samples are from the same organ. I ran sciclone on my samples, and when I feed its results into clonevol, the resulting model seems to assume the samples are all at the same time point. How can I specify the known time point of each sample so that clonevol can take it into account when inferring the model? Is that possible? Further, what do you recommend in cases like this with low-purity tumors?