abyzovlab / CNVpytor

a python extension of CNVnator -- a tool for CNV analysis from depth-of-coverage by mapped reads
MIT License

Many <DUP> have CN=2 and not CN=3 #199

Closed mjlarsen closed 10 months ago

mjlarsen commented 10 months ago

Hi, we have noticed that many, but not all, duplications are called as <DUP> in the ALT field, but with a CN value of 2 instead of 3.

This results in unwanted behavior in the downstream analysis software that we use (VarSeq), as it will treat such a CNV as having "Normal" copy number.

Best, Martin

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  97brhimuf-112438395833-Normal_DNA_noinfo-WGS_v2_IlluminaDNAPCRFree_6251-230331_A00606_AHW5W3DSX5-MOMA_ERIK_WGS-noinfo
chr1    16101   97brhimuf-112438395833-Normal_DNA_noinfo-WGS_v2_IlluminaDNAPCRFree_6251-230331_A00606_AHW5W3DSX5-MOMA_ERIK_WGS-noinfo_CNVnator_dup_2    N       <DUP>   .       GM_CNV_PoN      END=19700;IMPRECISE;SVLEN=3600;SVTYPE=DUP;natorP1=0.000386154;natorP2=2.66298e+09;natorP3=0.40509;natorP4=1.71282e-27;natorPE=0;natorQ0=0.5567;natorRD=2.1074;GM_CNV_AC=972;GM_CNV_AF=0.460445;OCC=606;FRQ=0.47566718995290425  GT:CN:PE        1/1:2:0
arpanda commented 10 months ago

Hi, according to the natorRD=2.1074 field in the INFO column, the region is being called as a dup.
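To find every record with this kind of mismatch, a minimal sketch along the following lines could be used. It parses one VCF data line, reads the RD value (`natorRD` or `pytorRD`, the two keys seen in this thread) and the `CN` sample field, and flags records whose SVTYPE disagrees with CN. The function names are hypothetical, the check assumes a diploid region where CN=2 is normal, and the example record is the one from the report above with the long sample ID shortened to `dup_2` and most INFO fields dropped.

```python
def parse_cnv_record(line):
    """Parse one tab-separated VCF data line into the fields compared here."""
    cols = line.rstrip("\n").split("\t")
    info = {}
    for item in cols[7].split(";"):
        key, _, value = item.partition("=")
        info[key] = value  # flags such as IMPRECISE get an empty string
    # Pair the FORMAT keys with the values of the first sample column.
    sample = dict(zip(cols[8].split(":"), cols[9].split(":")))
    rd_raw = info.get("natorRD") or info.get("pytorRD")
    cn_raw = sample.get("CN", "")
    return {
        "chrom": cols[0],
        "pos": int(cols[1]),
        "svtype": info.get("SVTYPE"),
        "rd": float(rd_raw) if rd_raw else None,
        "cn": int(cn_raw) if cn_raw.isdigit() else None,
    }


def cn_conflicts_with_svtype(rec):
    """Assumption: diploid region, so a DUP should have CN > 2 and a DEL CN < 2."""
    if rec["cn"] is None:
        return False
    if rec["svtype"] == "DUP":
        return rec["cn"] <= 2
    if rec["svtype"] == "DEL":
        return rec["cn"] >= 2
    return False


# Shortened version of the record quoted in the report (ID is a placeholder).
record = (
    "chr1\t16101\tdup_2\tN\t<DUP>\t.\tGM_CNV_PoN\t"
    "END=19700;IMPRECISE;SVLEN=3600;SVTYPE=DUP;natorRD=2.1074\t"
    "GT:CN:PE\t1/1:2:0"
)
rec = parse_cnv_record(record)
print(rec["svtype"], rec["rd"], rec["cn"], cn_conflicts_with_svtype(rec))
```

Running the same check over all data lines of a VCF (skipping lines that start with `#`) would give the list of affected calls to attach to a report.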

Regarding the CN field, could you specify which version of CNVpytor you are using? If it is not the latest version, please upgrade.

Here is the command to install the latest version:

pip install git+https://github.com/abyzovlab/CNVpytor.git
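To confirm which version is active in a given environment, a small check like the one below may help. It only uses the standard-library `importlib.metadata` module; the helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the name `cnvpytor`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="cnvpytor"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print(installed_version())
```

Run it with the same Python interpreter that the analysis pipeline uses, since a core facility may have several environments installed.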

If the issue persists, could you consider sharing the pytor file? If you're able to, please send it to panda.arijit@mayo.edu.

-- Arijit

mjlarsen commented 10 months ago

Hi, CNVpytor v1.2.1 was used. Do you think that could be the reason? I don't see anything in the change log indicating that this error was fixed. The analysis is done by a core facility, so unfortunately I don't have access to the pytor file at the moment. I may be able to get it. Best, Martin

arpanda commented 10 months ago

Hi Martin, we made enhancements to the VCF output code in CNVpytor v1.3.1 (reference: https://github.com/abyzovlab/CNVpytor#new-in-version-131). I think it should address your concern.

--Arijit

xiucz commented 10 months ago

Hi @arpanda, after updating to v1.3.1, this issue still seems to persist.

N3  rd_mean_shift   duplication chr19   41900001.0  45000000.0  3100000.0   0.28151381343936505 0.007010645889407209    2592325514.5613227  0.007010645889407209    2592325514.5613227  0.0 0.0 14119100.0

or vcf format:

chr19   41900001    CNVpytor_dup211 N   <DUP>   .   PASS    END=45000000;IMPRECISE;SVLEN=3100000;SVTYPE=DUP;pytorRD=0.281514;pytorP1=0.00701065;pytorP2=2.59233e+09;pytorP3=0.00701065;pytorP4=2.59233e+09;pytorQ0=0;pytorPN=0;pytorDG=14119100;pytorCL=rd_mean_shift   GT:CN   0/1:3

Best, xiucz

arpanda commented 10 months ago

Hi @xiucz, I think this is an issue with the call itself. @suvakov is checking such calls.

Could you consider sharing the pytor file? If you're able to, please send it to panda.arijit@mayo.edu.

-Arijit

xiucz commented 10 months ago

@arpanda I have sent the pytor file via Gmail.

arpanda commented 10 months ago

Hi @xiucz, The shared pytor file is from a very low coverage sample. Here are the read depth statistics:

2023-11-08 10:52:07,286 - cnvpytor.viewer - INFO - RD stat for Autosomes: 2.27 +- 1.84
2023-11-08 10:52:07,315 - cnvpytor.viewer - INFO - RD stat for X/Y: 1.63 +- 1.04

Because of this, the normalization is not working properly.
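As a rough way to pull these numbers out of a log and see how wide the spread is relative to the mean, something like the sketch below could be used. The regular expression is written against the two log lines quoted above only, the function name is hypothetical, and the std/mean ratio is an illustrative heuristic rather than a threshold defined by CNVpytor.

```python
import re

# Matches lines such as "... RD stat for Autosomes: 2.27 +- 1.84"
RD_STAT = re.compile(
    r"RD stat for (?P<region>[^:]+):\s*(?P<mean>[\d.]+)\s*\+-\s*(?P<std>[\d.]+)"
)


def parse_rd_stats(log_text):
    """Return {region: (mean, std)} for every RD stat line in the log text."""
    stats = {}
    for m in RD_STAT.finditer(log_text):
        stats[m["region"].strip()] = (float(m["mean"]), float(m["std"]))
    return stats


log = """\
2023-11-08 10:52:07,286 - cnvpytor.viewer - INFO - RD stat for Autosomes: 2.27 +- 1.84
2023-11-08 10:52:07,315 - cnvpytor.viewer - INFO - RD stat for X/Y: 1.63 +- 1.04
"""

for region, (mean, std) in parse_rd_stats(log).items():
    ratio = std / mean if mean else float("nan")
    print(f"{region}: mean={mean}, std={std}, std/mean={ratio:.2f}")
```

For the lines above the ratio comes out at about 0.81 for autosomes and 0.64 for X/Y, meaning the standard deviation is close to the mean itself.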

I'm closing this issue. Please feel free to reopen it or create a new issue if the problem continues.

-Arijit