kmavrommatis closed 2 months ago
Hey!
The logic is valid. I think sample and sample.1 are the clonalities of the variant in the samples (note: sample cell fraction, not cancer cell fraction).
The river output deals with clonalities as opposed to VAFs, so you won't find VAF information there, although you can reverse-calculate it by matching against the local CNA, as you suggest. It is probably better to look at somaticVariants.xls (or .csv), but that file only has information for the samples where the variant is called, not across all samples. Look at multisample as well, which covers all samples and carries VAF information. I haven't touched the multisample output in a few years though, so it might be a bit dated. Otherwise the scatter plots, especially clones.png, can be a good visualisation of the VAFs in different clones, but maybe you want the numbers.
The best way to access the raw data otherwise is from the R output in Rdirectory/myIndividual/allVariants.Rdata, which is a nested list where allVariants$variants$variants$mySample is a data frame with all information about all variants that are present in the VCF from any sample in that individual.
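To make the reverse calculation from clonality to VAF concrete, here is a minimal Python sketch. It is not superFreq code: the function name `expected_vaf` and its parameters are hypothetical, and it assumes that all cells outside the clone carry two unmutated copies of the locus, which may not hold if other clones have their own CNAs there.

```python
def expected_vaf(clonality, mutated_copies=1, total_copies=2, background_copies=2):
    """Expected VAF for a variant carried by a fraction `clonality` of all cells.

    clonality         -- sample cell fraction carrying the variant (0 to 1)
    mutated_copies    -- copies of the variant allele per carrying cell
    total_copies      -- total copies of the locus per carrying cell (from the local CNA)
    background_copies -- copies of the locus in all other cells (assumed diploid here)
    """
    if not 0.0 <= clonality <= 1.0:
        raise ValueError("clonality must be between 0 and 1")
    if not 0 <= mutated_copies <= total_copies:
        raise ValueError("mutated_copies must be between 0 and total_copies")

    # Variant alleles contributed by the carrying cells.
    variant_alleles = clonality * mutated_copies
    # All alleles at the locus, from carrying and non-carrying cells.
    all_alleles = clonality * total_copies + (1.0 - clonality) * background_copies
    if all_alleles == 0:
        raise ValueError("no copies of the locus in the sample")
    return variant_alleles / all_alleles


# Clone at 51% sample cell fraction, no CNA at the locus.
print(f"heterozygous: {expected_vaf(0.51, mutated_copies=1):.3f}")  # 0.255
print(f"homozygous:   {expected_vaf(0.51, mutated_copies=2):.3f}")  # 0.510
```

Comparing the observed VAF (for example from the multisample output or allVariants.Rdata) with these two values gives a rough indication of whether the variant is heterozygous or homozygous in the clone.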
Which is assigned to clone 2. Clone 2 is a clone with abundance 51%. Assuming this position is heterozygous, that implies a VAF of ~25%, which is within the expected range. How can I confirm the VAF of this mutation, or rather find out whether it is homozygous or heterozygous in this clone? What do the values under sample and sample.1 mean?
Is this logic valid, or am I missing something? Thanks in advance for your help.