abyzovlab / CNVnator

a tool for CNV discovery and genotyping from depth-of-coverage by mapped reads

An error while running in a partial region #285

Open vegetableyu opened 10 months ago

vegetableyu commented 10 months ago

Hi there, I extracted the region chr6:28000000-34000000 from a BAM file and ran CNVnator as follows:
step1: cnvnator -root ${SID}.root -tree ${BAM}
step2: cnvnator -root ${SID}.root -his ${BIN_SIZE} -chrom ${CHRlist} -fasta ${USER_REF}
step3: cnvnator -root ${SID}.root -stat ${BIN_SIZE}
step4: cnvnator -root ${SID}.root -partition ${BIN_SIZE}
step5: cnvnator -root ${SID}.root -call ${BIN_SIZE} > ${SID}.rawcnv

The error message from step3 looked like this:
Making statistics for chr6 ...
Average RD per bin (1-22) is 0.0600142 +- 0.223212 (before GC correction)
Average RD per bin (X,Y) is 0 +- 0 (before GC correction)
Correcting counts by GC-content for 'chr6' ...
Zero value of GC average. Bin 4245 with center 424450 is not corrected.
Zero value of GC average. Bin 4246 with center 424550 is not corrected.
Zero value of GC average. Bin 13135 with center 1.31345e+06 is not corrected.
Zero value of GC average. Bin 13900 with center 1.38995e+06 is not corrected.
Zero value of GC average. Bin 14580 with center 1.45795e+06 is not corrected.
Zero value of GC average. Bin 15362 with center 1.53615e+06 is not corrected.
Zero value of GC average. Bin 16098 with center 1.60975e+06 is not corrected.
Zero value of GC average. Bin 16525 with center 1.65245e+06 is not corrected.
Zero value of GC average. Bin 16526 with center 1.65255e+06 is not corrected.
Zero value of GC average. Bin 16527 with center 1.65265e+06 is not corrected.
Zero value of GC average. Bin 28762 with center 2.87615e+06 is not corrected.
Zero value of GC average. Bin 31632 with center 3.16315e+06 is not corrected.
Zero value of GC average. .............
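For anyone triaging a similar log, here is a minimal Python sketch that collects the "not corrected" bins and infers the bin size from them. The function names are hypothetical, and the bin-size formula assumes 1-based, equal-width bins, which matches the numbers in the log above but is not taken from the CNVnator source.

```python
import re

# Matches the warning lines shown in the step3 log above.
BIN_RE = re.compile(r"Bin (\d+) with center ([0-9.e+]+) is not corrected")


def uncorrected_bins(log_text):
    """Return (bin_index, center) pairs for bins skipped by GC correction."""
    return [(int(m.group(1)), float(m.group(2))) for m in BIN_RE.finditer(log_text)]


def infer_bin_size(bin_index, center):
    # Assumption: bins are 1-based and equal width, so
    # center = (bin_index - 0.5) * bin_size.
    return round(center / (bin_index - 0.5))


sample = (
    "Zero value of GC average. Bin 4245 with center 424450 is not corrected.\n"
    "Zero value of GC average. Bin 4246 with center 424550 is not corrected.\n"
    "Zero value of GC average. Bin 13135 with center 1.31345e+06 is not corrected.\n"
)
bins = uncorrected_bins(sample)
print(len(bins), infer_bin_size(*bins[0]))  # prints "3 100"
```

With these lines the inferred bin size is 100 bp, and the listed bin centers (0.4 Mb to 3.2 Mb) lie outside the extracted 28-34 Mb region.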

Then, when I run step5, the program keeps running and never finishes. I wonder whether this is caused by the error in step3. Also, does CNVnator support running on partial chromosomal regions?

Thanks for CNVnator and for your time!
yu

abyzov commented 10 months ago

Hi, CNVnator is designed to work with whole-genome data. Results from data covering only a small region are unpredictable. In this particular case, normalization likely failed.
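As a rough illustration of how little of the chromosome carries reads here, the sketch below computes the fraction of chr6 covered by the extracted region. The chromosome length is an assumption (the GRCh37/hg19 value), since the thread does not say which reference was used, and the function name is hypothetical.

```python
def covered_fraction(region_start, region_end, chrom_length):
    """Fraction of a chromosome spanned by the extracted region."""
    return (region_end - region_start) / chrom_length


# Assumption: GRCh37/hg19 length of chr6; other builds differ slightly.
CHR6_LENGTH = 171_115_067

frac = covered_fraction(28_000_000, 34_000_000, CHR6_LENGTH)
print(f"{frac:.1%}")  # prints "3.5%"
```

Under that assumption, roughly 96% of the chr6 bins contain no reads at all, which is consistent with the very low average RD per bin in the log, though the thread does not confirm this as the cause.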

Alexej Abyzov, Ph.D. Senior Associate Consultant, Associate Professor of Biomedical Informatics, Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic

Mayo Clinic, 200 1st Street SW, Harwick 7-91, Rochester, MN 55905
http://www.abyzovlab.org
tel: +1-(507)-538-0978

vegetableyu commented 10 months ago

Thanks for the reply! I'm also wondering whether I would get a correct result if I also extracted the corresponding region of the FASTA file and ran CNVnator on that.

abyzov commented 10 months ago

Do you mean that you would treat that region as a pseudo-genome (pseudo-chromosome)? If the region is sufficiently large and there is sufficient coverage, you may get a reasonable result. But we've never tried such an approach. To make this approach work, you'll have to change the BAM header to reflect the new pseudo-genome (pseudo-chromosome).
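A minimal Python sketch of what the header change could look like, operating on SAM text lines (for example, the output of `samtools view -h`). This is untested against CNVnator; the function name and the `chr6_region` sequence name are hypothetical. Shifting read positions is an added assumption beyond the header change mentioned above, made so that coordinates stay within the new sequence length.

```python
def to_pseudo_chromosome(sam_lines, chrom, start, end, new_name):
    """Rewrite SAM lines so chrom:start-end (1-based, inclusive) becomes
    a standalone sequence called new_name."""
    out = []
    for line in sam_lines:
        if line.startswith("@SQ"):
            # Keep a single @SQ record describing the pseudo-chromosome.
            if f"SN:{chrom}" in line.split("\t"):
                out.append(f"@SQ\tSN:{new_name}\tLN:{end - start + 1}")
        elif line.startswith("@"):
            out.append(line)  # other header records pass through unchanged
        else:
            f = line.split("\t")
            pos = int(f[3])
            if f[2] != chrom or not (start <= pos <= end):
                continue  # drop reads that start outside the region
            f[2] = new_name
            f[3] = str(pos - start + 1)
            # Crude mate handling: shift mates inside the region and
            # unset the rest. FLAG bits and tags such as SA are not
            # updated, and reads running past `end` are not trimmed.
            if f[6] in ("=", chrom) and start <= int(f[7]) <= end:
                f[6], f[7] = "=", str(int(f[7]) - start + 1)
            elif f[6] != "*":
                f[6], f[7] = "*", "0"
            out.append("\t".join(f))
    return out


sam = [
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:chr6\tLN:171115067",
    "@SQ\tSN:chr7\tLN:159138663",
    "r1\t0\tchr6\t28000100\t60\t50M\t*\t0\t0\t" + "A" * 50 + "\t" + "I" * 50,
]
result = to_pseudo_chromosome(sam, "chr6", 28_000_000, 34_000_000, "chr6_region")
print(result[1])  # prints the new @SQ line with LN:6000001
```

The FASTA passed to `-fasta` would need the same treatment: the extracted sequence must carry the same name and length as the new `@SQ` record.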


vegetableyu commented 10 months ago

Yes. I will try this and thanks for your reply!