etal / cnvkit

Copy number variant detection from targeted DNA sequencing
http://cnvkit.readthedocs.org

No coverage on CDKN2A gene caused false-negative CN segment #683

Open zhuhenan opened 2 years ago

zhuhenan commented 2 years ago

Hi,

I have a problem with CDKN2A deletions. My project includes multiple glioblastoma samples that are known to have a complete CDKN2A deletion. However, instead of extremely low coverage, most of my samples have no coverage at all on CDKN2A when the deletion is present, so my CNR files usually record a log2 value of -28 with 0 depth for CDKN2A. When I run segmentation with CBS, the CDKN2A gene tends to be ignored and filled in with the log2 value from the adjacent regions.

I am not sure whether this is normal. Some details on sample preparation: we have both patient and xenograft samples. All xenograft samples were purified by flow cells to remove mouse cells and normal tissue cells. We then ran WES on all samples with Illumina HiSeq and NovaSeq at ~150x or higher, and used bbsplit to remove contaminating mouse reads. BAM files were generated with BWA-MEM for alignment and GATK for marking duplicates and recalibrating base quality scores. For reasons I don't understand, CDKN2A has no coverage at all in most samples where the deletion is present (the deletion was confirmed by targeted sequencing performed by collaborators). I also tried forcing reads to map to the CDKN2A gene, but only got poor alignments with extremely low mapping quality scores. This happens in both patient and xenograft samples.

To get CNV segments, I have tried several combinations of --drop-low-coverage, --drop-outliers, and different p-value thresholds for the CNVkit segment command. So far, with the CBS method, I have found that --drop-low-coverage has to be turned off for most of the CDKN2A deletion samples but must be turned on for certain other CDKN2A deletion cases. There is also a set of samples for which CBS never calls the CDKN2A deletion properly, no matter which combination of settings I use.

Would you mind providing some guidance for my data preparation and segment method settings? I have attached an example CNR file for chr9 from one of my CDKN2A deletion samples.

example.zip

Best, Henan

etal commented 2 years ago

It sounds like your tumor samples are too pure for the assumptions that go with bulk tissue samples; I think you need to turn off --drop-low-coverage everywhere.
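To see which bins the low-coverage filter would discard, here is a minimal Python sketch that counts a gene's bins falling below a log2 cutoff. It assumes the .cnr file is tab-separated with `gene` and `log2` columns; the cutoff of -15 is an assumed placeholder rather than a value confirmed in this thread, the function names are hypothetical, and the toy rows are made up for illustration.

```python
import csv
import io


def gene_bins(cnr_text, gene):
    """Return rows of a tab-separated .cnr table whose gene field lists `gene`."""
    reader = csv.DictReader(io.StringIO(cnr_text), delimiter="\t")
    # A bin can carry several comma-separated gene names.
    return [row for row in reader if gene in row["gene"].split(",")]


def count_low_coverage(rows, cutoff=-15.0):
    """Return (bins below cutoff, total bins). The cutoff is an assumption."""
    low = [row for row in rows if float(row["log2"]) < cutoff]
    return len(low), len(rows)


# Made-up rows for illustration only; real .cnr files have many more bins.
TOY_CNR = (
    "chromosome\tstart\tend\tgene\tlog2\tdepth\tweight\n"
    "chr9\t21800000\t21800200\tMTAP\t-0.12\t140.0\t0.9\n"
    "chr9\t21967000\t21968000\tCDKN2A\t-28.0\t0.0\t0.5\n"
    "chr9\t21970000\t21971000\tCDKN2A\t-28.0\t0.0\t0.5\n"
    "chr9\t22005000\t22006000\tCDKN2B\t-0.05\t150.0\t0.9\n"
)

low, total = count_low_coverage(gene_bins(TOY_CNR, "CDKN2A"))
print(f"CDKN2A: {low} of {total} bins below cutoff")
```

If every bin of the gene falls below the cutoff, dropping low-coverage bins leaves the segmenter nothing to see there, which would be consistent with the gap being filled from the neighboring regions.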

etal commented 2 years ago

You might also do well with the bintest command to supplement your segmentation results. This will give you exon-level calls, which can be noisy for whole exomes, but in this case you know which alterations you're looking for so grep will solve that problem.
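If you would rather do the grep step in Python, a rough sketch follows. It assumes the bintest output is a tab-separated table with a header line containing a `gene` column; the function name and the toy lines are hypothetical, so check the column names against your own file.

```python
def filter_by_gene(lines, gene):
    """Yield the header line, then every row whose gene field lists `gene`."""
    lines = iter(lines)
    header = next(lines).rstrip("\n")
    gene_idx = header.split("\t").index("gene")
    yield header
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        # Match whole gene names so "CDKN2A" does not also pick up "CDKN2A-AS1".
        if gene in fields[gene_idx].split(","):
            yield line


# Made-up lines for illustration; with a real file, pass the open file handle.
toy_lines = [
    "chromosome\tstart\tend\tgene\tlog2\n",
    "chr9\t21967000\t21968000\tCDKN2A\t-28.0\n",
    "chr9\t21994000\t21995000\tCDKN2A-AS1,CDKN2B\t-0.1\n",
    "chr7\t55086000\t55087000\tEGFR\t2.1\n",
]

for row in filter_by_gene(toy_lines, "CDKN2A"):
    print(row)
```

Unlike a plain grep, this keeps the header and matches on exact gene names, which avoids pulling in neighboring genes that share a prefix.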