Open skhalid7 opened 7 years ago
SNPs in the genome (VCF) are not evenly spaced. The algorithm samples loci so that only one locus is used per interval, with the interval width set by snp.nbhd in preProcSample. This is stated clearly in the paper (https://www.ncbi.nlm.nih.gov/pubmed/27270079). Because the sampling is random, you need to set the random number generator seed to get reproducible results. I will add this to the package documentation.
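To make the effect of the seed concrete, here is a minimal toy sketch in Python of "one locus per interval" sampling. It is not FACETS code: the function name `thin_loci`, the binning scheme, and the example positions are all assumptions made for illustration, and the real snp.nbhd logic in preProcSample may differ in detail.

```python
import random

def thin_loci(positions, window, seed=None):
    """Keep one randomly chosen locus per `window`-bp bin (toy model, not FACETS code)."""
    rng = random.Random(seed)  # private generator, so the seed fully controls the draw
    bins = {}
    for pos in sorted(positions):
        bins.setdefault(pos // window, []).append(pos)
    # One random pick per occupied bin, returned in genomic order
    return [rng.choice(bins[b]) for b in sorted(bins)]

# Hypothetical, unevenly spaced SNP positions
positions = [100, 120, 180, 260, 300, 410, 900, 905, 950, 1300]

run_a = thin_loci(positions, window=250, seed=1234)
run_b = thin_loci(positions, window=250, seed=1234)
run_c = thin_loci(positions, window=250)  # unseeded: the picks can change between runs

print(run_a == run_b)  # same seed, same loci
```

With the same seed the two runs pick identical loci, while the unseeded run may pick different ones, which in turn can shift the segmentation. In R, the equivalent step is calling set.seed() with a fixed value before preProcSample.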
As for changing the interpretation of the results, I find that hard to believe. The two lines of results you have included just say that the first segmentation has a long (1040 loci) segment, whereas the second run found a narrow change within it.
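The nesting can be checked directly from the start and end coordinates quoted in the question. The short Python sketch below uses only those four quoted rows; the helper name `containing_segment` is made up for this example, and the rest of each output file is not shown in the thread, so this says nothing about the other segments.

```python
# (start, end) coordinates copied from the two outputs quoted in this thread
run1_segments = [(69400, 12333300), (12335800, 12855700)]
run2_segments = [(69400, 6228400), (6246700, 6704800)]

def containing_segment(seg, candidates):
    """Return the first candidate interval that fully contains `seg`, else None."""
    start, end = seg
    for c_start, c_end in candidates:
        if c_start <= start and end <= c_end:
            return (c_start, c_end)
    return None

for seg in run2_segments:
    print(seg, "lies within", containing_segment(seg, run1_segments))
```

Both segments quoted from the second run fall inside the first run's 69400 to 12333300 segment.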
Venkat
Thanks for your prompt reply! This makes more sense now. One more question: what does NA mean? Is it equivalent to 0 (which I interpret as loss of an allele)? I couldn't find a mention of it in the documentation.
NA occurs when the information is not sufficient to determine the allelic copy number states (i.e. too few hets). For instance, a total copy number of 3 can occur as 0+3 or 1+2, and the algorithm can't tell which is the right call for that segment.
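As a small illustration of that ambiguity, the Python sketch below lists every way a total copy number can split into a lesser and a greater allelic copy number. The function name `allelic_splits` is made up for this example, and it only enumerates the candidates; it does not reproduce how FACETS chooses among them.

```python
def allelic_splits(tcn):
    """All (lesser, greater) allelic copy number pairs summing to tcn."""
    return [(lesser, tcn - lesser) for lesser in range(tcn // 2 + 1)]

print(allelic_splits(3))  # [(0, 3), (1, 2)]
print(allelic_splits(4))  # [(0, 4), (1, 3), (2, 2)]
```

So an lcn.em of NA means "any of these splits is possible, and the data cannot say which", whereas an lcn.em of 0 is a specific call that one allele is absent.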
Venkat
Hey, I used FACETS to look at copy number and loss-of-allele variations. I first created a snp-pileup file using the following flags and arguments: -g -q15 -Q20 -P100 -r25,0 vcffile outputfile normalbam tumorbam
I then ran the facets commands preProcSample, procSample and emcncf. I ran them on the same data twice and got different results (different enough to change the interpretation). Here are the first two rows of the two outputs, which should have been exactly the same. The first value in each data row is the row index.

Output 1:
chrom seg num.mark nhet cnlr.median mafR segclust cnlr.median.clust mafR.clust start end cf.em tcn.em lcn.em
1 1 1 1040 71 -0.301821193796061 0.704469557320904 37 -0.31481950340943 0.47804971563412 69400 12333300 0.333040756489311 2 0
2 1 2 141 14 0.0325177930781385 1.06498173921531 71 0.0582105493902717 NA 12335800 12855700 0.319892626137736 4 NA

Output 2:
chrom seg num.mark nhet cnlr.median mafR segclust cnlr.median.clust mafR.clust start end cf.em tcn.em lcn.em
1 1 1 196 14 -0.30655282913359 0.480031041349792 45 -0.287724432912274 NA 69400 6228400 NA 2 NA
2 1 2 76 3 -1.06520822385462 1.72439634993299 13 -1.06520822385462 NA 6246700 6704800 0.355586209444648 0 0