etal / cnvkit

Copy number variant detection from targeted DNA sequencing
http://cnvkit.readthedocs.org

Output the CN with decimal precision and annotate the region where the copy number change occurs #797

Open user-tq opened 1 year ago

user-tq commented 1 year ago

[image] I'm trying to reproduce the results of a comparison; the image shows its standard answer. I used the following command line (the only original files I received were FASTQ and BED, and I generated the BAM files following GATK best practices):

cnvkit.py batch ${tumor_bam} -n ${norm_bam} -t ${tar_bed} -f ${hg19_fa} --access ${hg19_ac_bed} --output-reference ${out_dir}/my_flat_reference.cnn -d ${out_dir} --annotate ${hg19flat}

[image] The final result confused me a little:

1. The segmentation of AXIN1 looks strange. The segment is too large for the copy number variation to be detected.
2. Exon 20 of MSH4 is placed in a separate segment. Does the altered region cover only exons 11-19? This differs from the copied region in the standard answer.
3. Can CNVkit output precise decimal copy numbers? I'm curious how its values differ from the standard answer.

Thanks for any suggestions
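
On question 3, a decimal copy number can be derived by hand from the log2 column of the .cns or .cnr output. The sketch below is not CNVkit code; the function name is made up for illustration, and it assumes a diploid reference, a pure tumor sample, and an autosomal region, so it will not reproduce any purity or ploidy correction that the standard answer may have applied.

```python
def log2_to_copy_number(log2_ratio, ploidy=2):
    """Convert a log2 ratio to a decimal copy number.

    Assumes the normal reference has `ploidy` copies and the sample is
    100% tumor (no purity correction).
    """
    return ploidy * 2 ** log2_ratio


# Example log2 ratios chosen for illustration only.
for log2_ratio in (-1.0, 0.0, 1.0):
    copies = log2_to_copy_number(log2_ratio)
    print(f"log2={log2_ratio:+.2f} -> CN={copies:.2f}")
```

Under these assumptions a log2 ratio of 0 corresponds to 2 copies, -1 to 1 copy, and +1 to 4 copies.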

user-tq commented 1 year ago

I realize that in batch mode CNVkit combines the antitarget and target data. For any project that uses a BED file to define the target regions, this is counterintuitive: antitarget bins cover off-target regions, so merging them with the target data means those areas are also analyzed and included in the calculations, which is not what I expected.

Furthermore, intersecting the final call.cns results with the BED file is not the right approach, because by that point the segmentation and the copy number calculation are already based on a .cnr that includes off-target regions. To avoid this, intersect the .cnr file with the BED file first, and only then perform the CNV calling to obtain correct results.
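
The intersection step described above could be sketched roughly as follows. This is an illustrative script, not part of CNVkit: the helper names are hypothetical, the .cnr is assumed to be a tab-separated file with a header containing `chromosome`, `start`, and `end` columns, and the rows in the demo are toy values rather than real data.

```python
import csv
import io


def read_bed(handle):
    """Read BED intervals into {chrom: [(start, end), ...]}."""
    targets = {}
    for line in handle:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        targets.setdefault(chrom, []).append((int(start), int(end)))
    return targets


def overlaps_target(targets, chrom, start, end):
    """True if the half-open bin [start, end) overlaps any target interval."""
    return any(s < end and start < e for s, e in targets.get(chrom, ()))


def filter_cnr(cnr_handle, targets, out_handle):
    """Copy only the .cnr rows that overlap a target; return the row count."""
    reader = csv.DictReader(cnr_handle, delimiter="\t")
    writer = csv.DictWriter(out_handle, fieldnames=reader.fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        if overlaps_target(targets, row["chromosome"],
                           int(row["start"]), int(row["end"])):
            writer.writerow(row)
            kept += 1
    return kept


# Toy inputs for illustration only: one on-target bin, one off-target bin.
bed = io.StringIO("chr1\t100\t200\tGENE_A\n")
cnr = io.StringIO(
    "chromosome\tstart\tend\tgene\tdepth\tlog2\tweight\n"
    "chr1\t100\t200\tGENE_A\t250.0\t0.58\t0.9\n"
    "chr1\t5000\t9000\tAntitarget\t1.2\t-0.10\t0.3\n"
)
out = io.StringIO()
kept = filter_cnr(cnr, read_bed(bed), out)
print(kept)
print(out.getvalue())
```

The filtered file could then be passed to `cnvkit.py segment` and `cnvkit.py call` in place of the original .cnr. Whether dropping the off-target bins actually improves the result for a given panel is something to check against the standard answer rather than assume.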