etal / cnvkit

Copy number variant detection from targeted DNA sequencing
http://cnvkit.readthedocs.org

The problem of false negative in male sex chromosomes #883

Open zhuying412 opened 4 months ago

zhuying412 commented 4 months ago

Hello, I am currently dealing with a false-negative case involving a male sample. When the log2 ratio for genes on the sex chromosomes reaches 1.4, we suspect a possible mosaic duplication. However, the do_call() function converts this value to an integer copy number of 1 (cn=1) through the operation outarr["cn"] = absolutes.round().astype("int"), and that value is then recorded in the CNV file.

When exporting to VCF format with the segments2vcf() function, this entry is filtered out because at that point ncopies==1 and abs_exp==1, so the segment is treated as copy-number neutral. This would not happen on a diploid chromosome, where a comparable gain would typically give ncopies==3 against abs_exp==2.
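To make the arithmetic concrete, here is a minimal sketch of the rounding effect. It is not CNVkit code: the helper name rounded_call and the pre-rounding values 1.4 and 2.8 are hypothetical, chosen only to represent the same relative gain on a one-copy and a two-copy chromosome.

```python
def rounded_call(absolute_cn, expected_cn):
    """Round an absolute copy-number estimate and compare it with the expected ploidy.

    Hypothetical helper for illustration only; it mimics the effect of
    rounding to an integer and then testing ncopies == abs_exp.
    """
    ncopies = int(round(absolute_cn))
    is_neutral = ncopies == expected_cn  # True means the segment would be skipped
    return ncopies, is_neutral


# Male sex chromosome: one expected copy, assumed estimate of 1.4 copies.
haploid = rounded_call(1.4, expected_cn=1)

# Autosome with the same relative gain (1.4x): two expected copies, 2.8 copies.
diploid = rounded_call(2.8, expected_cn=2)

print("haploid:", haploid)
print("diploid:", diploid)
```

Under these assumptions the haploid estimate rounds back to the expected copy number and is treated as neutral, while the diploid estimate rounds to a value that differs from the expected one and is kept.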

The relevant code snippet for the filtering logic during VCF export is as follows:

```python
for out_row, abs_exp in zip(out_dframe.itertuples(index=False), abs_expect):
    if (
        out_row.ncopies == abs_exp
        or
        # To accommodate data from the faulty v0.7.1 version (#53)
        not str(out_row.probes).isdigit()
    ):
        # Skip regions with neutral copy number
        continue  # or mark as "CNV" for subclonal events?
```