etal / cnvkit

Copy number variant detection from targeted DNA sequencing
http://cnvkit.readthedocs.org

Running CNVkit with VCF files #886

Open GACGAMA opened 3 months ago

GACGAMA commented 3 months ago

Hello! After reviewing the full documentation, I have a question that it does not clearly answer:

Let's say I'm analyzing a cohort of tumor-normal samples, but I also have hundreds of unrelated controls available.

I call my segments using a pool of normals because, according to the manual, this gives higher sensitivity.

Now I have a reference built from all my normal samples, including the paired normals, and one .cns file for each tumor.

But then, let's say I want to use SNP allele frequencies to identify LOH:

cnvkit.py call Sample.cns -y -v Sample.vcf -m clonal -o Sample.call.cns

Which samples should be present in that VCF? Just the normal-tumor pair?
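
To make the question concrete, here is a minimal Python sketch of what I mean by SNP allele frequencies from a tumor-normal VCF. It is only an illustration of the idea, not CNVkit's actual code: the inline VCF text, the sample names Sample_Normal and Sample_Tumor, and the function het_snp_bafs are all hypothetical, and I assume each record carries GT and AD fields.

```python
# Illustrative sketch only; this is NOT how CNVkit is implemented.
# The VCF content and sample names below are made up for the example.
VCF_TEXT = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSample_Normal\tSample_Tumor
chr1\t1000\t.\tA\tG\t50\tPASS\t.\tGT:AD\t0/1:20,20\t0/1:36,4
chr1\t2000\t.\tC\tT\t50\tPASS\t.\tGT:AD\t0/0:40,0\t0/0:40,0
"""


def het_snp_bafs(vcf_text, tumor_id, normal_id):
    """Return (chrom, pos, tumor_baf) for SNPs that are heterozygous in the normal."""
    results = []
    samples = []
    for line in vcf_text.splitlines():
        if line.startswith("##") or not line.strip():
            continue  # skip meta-information and blank lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns start after FORMAT
            continue
        fmt_keys = fields[8].split(":")
        calls = {
            name: dict(zip(fmt_keys, column.split(":")))
            for name, column in zip(samples, fields[9:])
        }
        # Only germline-heterozygous sites are informative for LOH.
        normal_gt = calls[normal_id]["GT"].replace("|", "/")
        if normal_gt not in ("0/1", "1/0"):
            continue
        ref_depth, alt_depth = (int(x) for x in calls[tumor_id]["AD"].split(",")[:2])
        total = ref_depth + alt_depth
        if total == 0:
            continue  # no coverage in the tumor at this site
        results.append((fields[0], int(fields[1]), alt_depth / total))
    return results


bafs = het_snp_bafs(VCF_TEXT, "Sample_Tumor", "Sample_Normal")
print(bafs)
```

In this toy example only the first site is kept, because the second is homozygous in the normal. A tumor allele frequency far from 0.5 at a site that is heterozygous in the normal is the kind of signal I would expect to indicate LOH, which is why I am unsure whether unrelated controls belong in the VCF at all.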

What if I want to run THetA2 to estimate tumor purity?

cnvkit.py export theta Sample_Tumor.cns reference.cnn -v Sample_Paired.vcf

Should that paired VCF also contain only the normal-tumor pair, even though the CNVs were called with a pool of normals?