Open DominikGlodzik opened 4 years ago
The VCF file here is not specific to Theta, it's used by CNVkit to extract b-allele frequencies. The format is the same as for other CNVkit commands: https://cnvkit.readthedocs.io/en/stable/fileformats.html#vcf
Hi, I have a similar question. We are running CNVkit with a panel of normals (PoN) because we don't have matched tumor-normal samples. In this case, can we create Sample_Paired.vcf by combining the VCF from one tumor sample with the VCFs from all the normal samples that were used to generate the PoN, and adding a PEDIGREE tag to the VCF header?
Best, Hyunjun
If you don't have a matched normal for each tumor sample, you can use an unmatched normal instead -- you only need 1 normal. The intent is to identify the likely germline-heterozygous SNPs that are present in the tumor sample, because these can be used to determine tumor fraction, whereas somatic mutations' allele frequencies are too noisy/heterogeneous to use here.
You can get the same effect independently by using the PoN or even dbSNP or other population genetics databases to filter the tumor-only VCF down to population SNP sites.
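As a rough illustration of that filtering step, here is a minimal Python sketch that keeps only the tumor VCF records whose position also appears in a population SNP VCF. This is not CNVkit code: the helper names `load_sites` and `filter_to_sites` and the toy records are made up for the example, and it matches on chromosome and position only (not alleles), so it assumes both files use the same chromosome naming and reference build.

```python
from typing import Iterable, Iterator, Set, Tuple

Site = Tuple[str, int]  # (chromosome, 1-based position)


def load_sites(vcf_lines: Iterable[str]) -> Set[Site]:
    """Collect (chrom, pos) for every record in a population SNP VCF."""
    sites: Set[Site] = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos = line.split("\t", 2)[:2]
        sites.add((chrom, int(pos)))
    return sites


def filter_to_sites(vcf_lines: Iterable[str], sites: Set[Site]) -> Iterator[str]:
    """Yield header lines unchanged, plus records located at a known site."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        if not line.strip():
            continue
        chrom, pos = line.split("\t", 2)[:2]
        if (chrom, int(pos)) in sites:
            yield line


# Toy data standing in for a dbSNP-style file and a tumor-only VCF.
population_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\trs1\tA\tG\t.\t.\t.",
    "chr2\t200\trs2\tC\tT\t.\t.\t.",
]
tumor_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSample_Tumor",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:AD\t0/1:12,10",
    "chr1\t150\t.\tT\tC\t40\tPASS\t.\tGT:AD\t0/1:30,3",
    "chr2\t200\t.\tC\tT\t60\tPASS\t.\tGT:AD\t0/1:9,11",
]

kept = list(filter_to_sites(tumor_vcf, load_sites(population_vcf)))
records = [line for line in kept if not line.startswith("#")]
print("\n".join(kept))
```

In practice the inputs would be read from files (for example with `open(...)`, or `gzip.open(..., "rt")` for compressed VCFs), and a dedicated tool for intersecting VCFs would handle allele matching and indexing more robustly than this sketch.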
Hello CNVkit team
I wonder if you could advise on the VCF format expected by the Theta export command:
cnvkit.py export theta Sample_Tumor.cns reference.cnn -v Sample_Paired.vcf
What information is required in the VCF file, and how should it be formatted (e.g. INFO vs. FORMAT fields)? I could not find this information in the documentation, and my trial and error failed.
Best wishes, Dominik