etal / cnvkit

Copy number variant detection from targeted DNA sequencing
http://cnvkit.readthedocs.org

Different sex estimated from cnr and cns files #785

Open ImNotaGit opened 1 year ago

ImNotaGit commented 1 year ago

When I used the cnvkit.py sex command to infer sample sex, the result (Male/Female) for some of my samples differed depending on whether the .cnr or the .cns file was used as input, although the X and Y log2 ratios were practically the same.

An example output is shown below: the results for sample s1 were consistent, but the results for sample s2 differed between the cnr and cns files. In this case, the cns result is apparently wrong. The default "female reference" was used throughout the analysis.

sample  sex     X_logratio  Y_logratio
s1.cnr  Male    -1.02       +0.255
s1.cns  Male    -1.02       +0.255
s2.cnr  Male    -1.03       +0.236
s2.cns  Female  -1.03       +0.235
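
For anyone checking a larger cohort, disagreements like the one for s2 can be flagged automatically. The following is only a minimal sketch: it parses the text output shown above, assuming the first column is the file name, the second is the inferred sex, and the .cnr and .cns files of a sample share the same stem. The names `SEX_TABLE` and `find_sex_mismatches` are hypothetical and not part of CNVkit.

```python
import os
from collections import defaultdict

# Output of `cnvkit.py sex`, copied from the example in this report.
SEX_TABLE = """\
sample\tsex\tX_logratio\tY_logratio
s1.cnr\tMale\t-1.02\t+0.255
s1.cns\tMale\t-1.02\t+0.255
s2.cnr\tMale\t-1.03\t+0.236
s2.cns\tFemale\t-1.03\t+0.235
"""


def find_sex_mismatches(table_text):
    """Return {sample_stem: {extension: sex}} for samples whose calls disagree.

    Hypothetical helper: assumes whitespace-separated columns with the file
    name first and the inferred sex second, and skips the header row.
    """
    calls = defaultdict(dict)
    lines = table_text.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2:
            continue  # ignore blank or malformed rows
        stem, ext = os.path.splitext(fields[0])
        calls[stem][ext] = fields[1]
    # Keep only samples where the inferred sex is not unanimous.
    return {
        stem: by_ext
        for stem, by_ext in calls.items()
        if len(set(by_ext.values())) > 1
    }


if __name__ == "__main__":
    for sample, by_ext in find_sex_mismatches(SEX_TABLE).items():
        print(sample, by_ext)
```

Run on the table above, this reports only s2, whose .cnr and .cns calls differ.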

I discovered this issue when I used the cns files for cnvkit.py heatmap: the X chromosome log2 ratios for some samples did not look right, and the log file also revealed some incorrect sex inference. However, the sex inference messages in the log files of cnvkit.py batch (which I used to perform my analysis) were correct. I was initially using cnvkit 0.9.4. I tried the latest version to date (0.9.9, installed via anaconda), but the issue persisted.

I may have some difficulty sharing the relevant cns and cnr files due to restrictions of my institution... Please kindly let me know how I can help with further debugging, if necessary. Thanks.