etal / cnvkit

Copy number variant detection from targeted DNA sequencing
http://cnvkit.readthedocs.org

the false negative of segment #833

Open qzhqzh opened 1 year ago

qzhqzh commented 1 year ago

I found a small deletion, chr2:71600701-71602850, in DYSF. But when I run cnvkit.py segment -m hmm-germline in.cnr -o out.cns, the deletion is merged into the segment chr2:10500-90402011, which is not a real CNV. I have confirmed by qPCR that this deletion is a true positive, so I think cnvkit.py segment is producing a false negative here. How can I get it to output the chr2:71600701-71602850 deletion in the .cns file? I have also tried cnvkit.py segment --drop-low-coverage -m hmm-germline in.cnr -o out.cns.

The .cnr lines are:

chr2    71570598        71570741        DYSF    344.84615384615387      -0.30249144841605385    0.9813324134255905
chr2    71574197        71574371        DYSF    338.5977011494253       0.04639229891935749     0.9935302308672166
chr2    71589579        71589699        DYSF    419.7083333333333       -0.06350402781637617    0.9822403060109705
chr2    71590189        71590309        DYSF    418.9583333333333       0.1276053443646271      0.9709310859128705
chr2    71598563        71598745        DYSF    406.84615384615387      -0.04497376934547723    0.9902599453992517
chr2    71600701        71600842        DYSF    195.24113475177305      -0.9822075671662628     0.9753169458242956
chr2    71601453        71601573        DYSF    235.35  -0.8695601481590665     0.9836282601368704
chr2    71602730        71602850        DYSF    198.55  -0.8324236907341246     0.9863254496609705
chr2    71611235        71611355        DYSF    364.2583333333333       0.05401022996463663     0.9818654874605705
chr2    71611464        71611626        DYSF    390.41975308641975      -0.08049284125639751    0.9849266885234083
chr2    71612640        71612806        DYSF    345.578313253012        -0.10463650776802523    0.9668357157464919
chr2    71613311        71613431        DYSF    416.51666666666665      0.08575719564731492     0.9776661461000704
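
As a quick sanity check on the rows above, the sketch below averages the log2 column over the three bins inside chr2:71600701-71602850 and over the nine flanking bins. It is only an illustration, not part of CNVkit: the variable names are made up here, and the conversion to copies assumes a diploid reference (copies ≈ 2 · 2^log2).

```python
# log2 ratios copied from the .cnr rows quoted above, keyed by bin start.
LOG2_BY_START = {
    71570598: -0.30249144841605385,
    71574197: 0.04639229891935749,
    71589579: -0.06350402781637617,
    71590189: 0.1276053443646271,
    71598563: -0.04497376934547723,
    71600701: -0.9822075671662628,
    71601453: -0.8695601481590665,
    71602730: -0.8324236907341246,
    71611235: 0.05401022996463663,
    71611464: -0.08049284125639751,
    71612640: -0.10463650776802523,
    71613311: 0.08575719564731492,
}

# Reported deletion interval on chr2.
DELETION_START, DELETION_END = 71600701, 71602850


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    values = list(values)
    return sum(values) / len(values)


# Split bins by whether their start falls inside the reported deletion.
inside = [v for s, v in LOG2_BY_START.items() if DELETION_START <= s < DELETION_END]
outside = [v for s, v in LOG2_BY_START.items() if not DELETION_START <= s < DELETION_END]

mean_inside = mean(inside)
mean_outside = mean(outside)

# Assumption: diploid reference, so copies ~= 2 * 2**log2.
implied_copies = 2 * 2 ** mean_inside

print(f"bins inside deletion: {len(inside)}, mean log2 = {mean_inside:.3f}")
print(f"flanking bins: {len(outside)}, mean log2 = {mean_outside:.3f}")
print(f"implied copies inside (diploid assumption): {implied_copies:.2f}")
```

Under that assumption, the three bins sit close to a single-copy loss while the flanking bins sit close to zero, which is the pattern the segmentation is expected to pick up.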

The .cns line is:

chr2    10500   90402011        FAM110C,SH3YL1,ACP1,ALKAL2,LOC100996637,TMEM18,SNTG2,TPO,PXDN,MYT1L,EIPR1,TRAPPC12,ADI1,RNASEH1,RPS7,COLEC11,ALLC,DCDC2C,SOX11,CMPK2,RSAD2,RNF144A,RNF144A,LOC101929452,ID2,KIDINS220,MBOAT2,ASAP2,ITGB1BP1,CPSF3,IAH1,ADAM17,YWHAQ,TAF1B,GRHL1,KLF11,CYS1,RRM2,HPCAL1,ODC1,NOL10,ATP6V1C2,ATP6V1C2,PDIA6,PDIA6,KCNF1,C2orf50,SLC66A3,ROCK2,E2F6,GREB1,NTSR2,LPIN1,TRIB2,LRATD1,NBAS,DDX1,MYCNOS,MYCN,MYCN,CYRIA,RAD51AP2,VSNL1,SMC6,GEN1,MSGN1,KCNS3,RDH14,NT5C1B-RDH14,NT5C1B-RDH14,NT5C1B,OSR1,TTC32,WDR35,LOC101928222,MATN3,MATN3,LAPTM4A,SDC1,PUM2,RHOB,LOC107985856,HS1BP3,GDF7,LDAH,APOB,TDRD15,KLHL29,ATAD2B,UBXN2A,MFSD2B,WDCP,FKBP1B,SF3B6,FAM228B,TP53I3,FAM228B,PFN4,FAM228B,FAM228A,ITSN2,NCOA1,PTRHD1,PTRHD1,CENPO,CENPO,CENPO,ADCY3,ADCY3,DNAJC27,EFR3B,POMC,DNMT3A,DTNB,ASXL2,KIF3C,RAB10,GAREM2,HADHA,HADHB,ADGRF3,ADGRF3,SELENOI,SELENOI,DRC1,OTOF,FAM166C,CIB4,KCNK3,SLC35F6,CENPA,DPYSL5,MAPRE3,TMEM214,AGBL5,OST4,EMILIN1,KHK,KHK,CGREF1,CGREF1,ABHD1,PREB,PRR30,TCF23,SLC5A6,ATRAID,CAD,SLC30A3,DNAJC5G,TRIM54,MPV17,GTF3C2,GTF3C2,GTF3C2-AS1,EIF2B4,SNX17,ZNF513,PPM1G,NRBP1,KRTCAP3,KRTCAP3,IFT172,IFT172,FNDC4,GCKR,C2orf16,ZNF512,CCDC121,CCDC121,GPN1,GPN1,SUPT7L,SUPT7L,SLC4A1AP,SLC4A1AP,LOC105374378,MRPL33,RBKS,RBKS,BABAM2,BABAM2-AS1,BABAM2,BABAM2,LOC100505716,FLJ31356,FOSL2,FOSL2,PLB1,PPP1CB,SPDYA,SPDYA,TRMT61B,TRMT61B,WDR43,TOGARAM2,PCARE,CLIP4,ALK,YPEL5,LBH,LCLAT1,CAPN13,GALNT14,CAPN14,EHD3,XDH,SRD5A2,MEMO1,DPY30,SPAST,SLC30A6,NLRC4,YIPF4,BIRC6,TTC27,LTBP1,RASGRP3,RASGRP3,RASGRP3-AS1,FAM98A,CRIM1,FEZ2,VIT,STRN,HEATR5B,GPATCH11,EIF2AK2,SULT6B1,CEBPZOS,CEBPZOS,CEBPZ,CEBPZ,NDUFAF7,NDUFAF7,PRKD3,PRKD3,QPCT,CDC42EP3,RMDN2,RMDN2,RMDN2-AS1,CYP1B1,ATL2,HNRNPLL,GALM,SRSF7,GEMIN6,DHX57,MORN2,ARHGEF33,ARHGEF33,LOC375196,SOS1,CDKL4,MAP4K3,TMEM178A,THUMPD2,SLC8A1-AS1,SLC8A1,SLC8A1,C2orf91,PKDCC,EML4-AS1,EML4,EML4,COX7A2L,KCNG3,MTA3,OXER1,HAAO,ZFP36L2,THADA,PLEKHH2,PLEKHH2,C1GALT1C1L,DYNC2LI1,DYNC2LI1,ABCG5,ABCG5,ABCG8,LRPPRC,PPM1B,SLC3A1,SLC3A1,PREPL,PREPL,CAMKMT,SIX3,SIX2,S
RBD1,PRKCE,EPAS1,TMEM247,ATP6V1E2,RHOQ,RHOQ,RHOQ-AS1,RHOQ,PIGF,PIGF,CRIPT,SOCS5,MCFD2,MCFD2,TTC7A,TTC7A,STPG4,CALM2,EPCAM,MSH2,KCNK12,MSH6,FBXO11,FOXN2,PPP1R21,STON1-GTF2A1L,STON1,STON1-GTF2A1L,GTF2A1L,STON1-GTF2A1L,LHCGR,STON1-GTF2A1L,FSHR,NRXN1,ASB3,GPR75-ASB3,ASB3,GPR75-ASB3,CHAC2,GPR75-ASB3,ERLEC1,GPR75-ASB3,GPR75-ASB3,GPR75,PSME4,ACYP2,ACYP2,TSPYL6,C2orf73,SPTBN1,EML6,RTN4,CLHC1,RPS27A,RPS27A,MIR4426,MTIF2,CCDC88A,CFAP36,PPP4R3B,PNPT1,EFEMP1,LOC100129434,CCDC85A,VRK2,VRK2,FANCL,FANCL,BCL11A,PAPOLG,REL,PUS10,PUS10,PEX13,PEX13,KIAA1841,C2orf74,AHSA2P,AHSA2P,USP34,USP34,XPO1,FAM161A,CCT4,COMMD1,B3GNT2,TMEM17,EHBP1,EHBP1,LOC100132215,OTX1,WDPCP,MDH1,UGP2,VPS54,PELI1,LGALSL,AFTPH,SERTAD2,SLC1A4,CEP68,RAB1A,ACTR2,SPRED2,MEIS1,MEIS1,MEIS1-AS2,ETAA1,C1D,WDR92,PNO1,PPP3R1,CNRIP1,PLEK,FBXO48,APLF,PROKR1,ARHGAP25,BMP10,GKN2,GKN1,ANTXR1,GFPT1,NFU1,AAK1,ANXA4,ANXA4,LOC107985770,GMCL1,SNRNP27,MXD1,ASPRV1,PCBP1,C2orf42,TIA1,PCYOX1,SNRPG,FAM136A,TGFA,ADD2,FIGLA,CLEC4F,CD207,VAX2,ATP6V1B1,ATP6V1B1,ATP6V1B1-AS1,ANKRD53,TEX261,NAGK,MCEE,MPHOSPH10,PAIP2B,ZNF638,DYSF,CYP26B1,EXOC6B,SPR,EMX1,SFXN5,RAB11FIP5,NOTO,SMYD5,PRADC1,CCT7,FBXO41,EGR4,ALMS1,NAT8,NAT8B,TPRKB,DUSP11,C2orf78,STAMBP,ACTG2,DGUOK,DGUOK,DGUOK-AS1,TET3,BOLA3,MOB1A,MTHFD2,SLC4A5,DCTN1,C2orf81,WDR54,RTKN,INO80B-WBP1,INO80B,INO80B-WBP1,WBP1,MOGS,MRPL53,CCDC142,TTC31,LBX2,LBX2,LBX2-AS1,PCGF1,TLX2,DQX1,AUP1,AUP1,HTRA2,HTRA2,HTRA2,LOXL3,LOXL3,LOXL3,DOK1,DOK1,M1AP,SEMA4F,HK2,POLE4,TACR1,EVA1A,MRPL19,GCFC2,LRRTM4,REG3G,REG1B,REG1A,REG3A,CTNNA2,CTNNA2,LRRTM1,SUCLG1,DNAH6,TRABD2A,TMSB10,KCMF1,TCF7L1,TGOLN2,RETSAT,ELMOD3,CAPG,SH2D6,MAT2A,GGCX,VAMP8,VAMP5,RNF181,TMEM150A,USP39,C2orf68,USP39,SFTPB,GNLY,ATOH8,ST3GAL5,POLR1A,PTCD3,IMMT,MRPL35,REEP1,KDM3A,CHMP3,RNF103-CHMP3,RNF103-CHMP3,RNF103,RNF103-CHMP3,RMND5A,RMND5A,CD8A,CD8B,RGPD2,RGPD2,RGPD1,RGPD2,RGPD1,PLGLB1,PLGLB2,PLGLB1,PLGLB2,ANAPC1P4,PLGLB2,PLGLB1,PLGLB2,PLGLB1,RGPD1,RGPD2,RGPD1,RGPD2,RGPD1,KRCC1,SMYD1,FABP1,THNSL2,FOXI3,TEX37,LOC101928371,EIF2AK3,EIF2AK3,RPIA       
-0.013314783324661716   2       338.95539848303815      7426    7187.519800894999

cnvkit_data.zip

zhuying412 commented 1 year ago

For the hmm-germline method, setting the window (-t) to 2 can resolve it. However, the option currently has no effect: in hmm.py the argument is commented out, as cnarr["log2"] = cnarr.smooth_log2() # window). It needs to be rewritten as cnarr["log2"] = cnarr.smooth_log2(window). Can you fix it, @etal?
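
To illustrate why the window size can matter for a deletion this short, here is a minimal sketch using a generic centered rolling mean on the log2 values from the .cnr rows above (rounded to four decimals). This is not CNVkit's smooth_log2, whose internals are not shown in this thread; the function name rolling_mean and the window sizes 3 and 9 are assumptions chosen for illustration.

```python
def rolling_mean(values, window):
    """Centered rolling mean; the window is clipped at the ends of the list.

    Even window sizes behave like the next odd size in this sketch.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# log2 ratios of the 12 DYSF bins quoted in the issue, rounded to 4 decimals.
log2 = [
    -0.3025, 0.0464, -0.0635, 0.1276, -0.0450,
    -0.9822, -0.8696, -0.8324,            # the three bins inside the deletion
    0.0540, -0.0805, -0.1046, 0.0858,
]

narrow = rolling_mean(log2, 3)  # window about as wide as the deletion
wide = rolling_mean(log2, 9)    # window much wider than the deletion

print("deepest raw log2:            ", min(log2))
print("deepest log2, window=3:      ", round(min(narrow), 4))
print("deepest log2, window=9:      ", round(min(wide), 4))
```

With the narrow window the dip stays close to its raw depth, while the wide window averages it with neutral neighbours and pulls it toward zero, which would make a three-bin deletion harder to separate from the surrounding segment.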

etal commented 1 year ago

I've merged @Zhu-Ying's pull request. Are you able to try your analysis of this sample again with the development version of CNVkit, @qzhqzh?