etal / cnvkit

Copy number variant detection from targeted DNA sequencing
http://cnvkit.readthedocs.org

CNVkit cnv_ztest with low log2 values #534

Open smerella opened 4 years ago

smerella commented 4 years ago

Dear all, I am using CNVkit to analyse exome sequencing data (germline samples) in order to find new copy number variants. For the pipeline I followed the suggestions I found for germline samples; here are the steps:

1) create a reference using all samples

cnvkit.py batch --normal ALL_BAM/*.bam --targets Exome_SureSelect_QXTV7_forCNVkit.bed --fasta hs37d5.fa --access access-5k-mappable.hg19.bed --output-reference reference.cnn

2) batch command using the previously created reference

cnvkit.py batch ALL_BAM/*.bam -r reference.cnn -d results/

3) add ci column

cd results/
for i in *.cnr ; do cnvkit.py segmetrics -s `basename ${i%%.cnr}`.cn{s,r} --ci ; done

4) call command

for i in *segmetrics* ; do cnvkit.py call $i --filter ci -m clonal --center mode -o call_cnvkit/`basename ${i%%segmetrics}`.call.cns; done

5) cnv_ztest

for i in *.cnr; do cnv_ztest.py $i -t -s call_cnvkit/`basename ${i%%.cnr}`..call.cns -o cnv_ztest/`basename ${i%%.cnr}`.ztest.cnr; done
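The loops above build each output name by stripping a suffix from the input file and passing the result through basename. A minimal sketch of that naming step on its own, which prints the derived paths instead of running CNVkit, might look like the following. The function name `sample_paths` and the sample file name are made up for illustration, and the output names use a single dot before the extension.

```shell
# Print the per-sample names a loop over *.cnr would derive, without running CNVkit.
# sample_paths and the file names below are hypothetical, for illustration only.
sample_paths() {
    cnr="$1"
    # ${cnr%.cnr} drops the .cnr suffix; basename then drops any leading directory.
    sample=$(basename "${cnr%.cnr}")
    echo "$sample call_cnvkit/$sample.call.cns cnv_ztest/$sample.ztest.cnr"
}

sample_paths results/sampleA.cnr
```

For a literal suffix such as `.cnr`, `%` and `%%` remove the same text; they differ only when the pattern contains a wildcard.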

I am now looking at the cnv_ztest results, but I do not understand the log2 values reported in the file: they are strongly negative in all samples (roughly between -21 and -8). Is there something I am missing, or is there an error in my pipeline? I also had a look at this thread, but because I am using germline samples I didn't use --drop-low-coverage as suggested there. Do you have any idea what is going wrong? Thanks in advance!

Stefania
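One quick way to see how widespread such values are is to summarise the log2 column of an output file. The sketch below assumes the file is tab-separated with a header row containing a column named `log2`. The function name `log2_summary` and the default floor of -5 are arbitrary choices for illustration, not CNVkit conventions.

```shell
# Summarise the log2 column of a tab-separated table read from stdin.
# The column is located by its header name, so column order does not matter.
# log2_summary and the default floor (-5) are illustrative assumptions.
log2_summary() {
    awk -F'\t' -v floor="${1:--5}" '
        NR == 1 { for (c = 1; c <= NF; c++) if ($c == "log2") col = c; next }
        col {
            v = $col + 0
            if (n == 0 || v < min) min = v
            if (n == 0 || v > max) max = v
            if (v < floor) low++
            n++
        }
        END { printf "bins=%d min=%.2f max=%.2f below_floor=%d\n", n, min, max, low }
    '
}

# Two made-up bins, one of them with an implausibly low log2 value.
printf 'chromosome\tstart\tend\tgene\tdepth\tlog2\tweight\n1\t100\t200\tA\t50\t-0.1\t0.9\n1\t300\t400\tB\t0.1\t-21\t0.5\n' | log2_summary
# prints: bins=2 min=-21.00 max=-0.10 below_floor=1
```

On a real file this would be called as, for example, `log2_summary < cnv_ztest/sample.ztest.cnr`, where the path is a placeholder.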

etal commented 4 years ago

There was a bug in the cnv_ztest calculation in CNVkit v0.9.6. Could you try the equivalent bintest command in the more recently released CNVkit 0.9.7? The bug is fixed there, and bintest should report more meaningful bin-level statistics.
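To check whether an installation is recent enough before rerunning the test, a small version comparison can help. This is only a sketch: `version_ge` is a hypothetical helper that relies on GNU `sort -V`, the installed version is hard-coded as a placeholder, and the commented bintest line assumes its options mirror those of the cnv_ztest.py call in the original post.

```shell
# Succeed if version $1 is at least version $2 (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Placeholder value; in practice this could come from the installed tool,
# e.g. installed=$(cnvkit.py version), assuming it prints a bare version string.
installed="0.9.6"

if version_ge "$installed" "0.9.7"; then
    echo "bintest available"
    # Assumed invocation, with hypothetical file names:
    # cnvkit.py bintest sample.cnr -s sample.call.cns -t -o sample.bintest.cns
else
    echo "upgrade CNVkit before testing bins"
fi
```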