Smoothing window fraction fix results in noisier data in targeted sequencing panel?

etal / cnvkit

Copy number variant detection from targeted DNA sequencing


We're currently evaluating an update from version 0.9.9 to 0.9.10. However, we're seeing increased noise in the log2 ratios, which results in a higher rate of false positives. The increase shows up in the IQRs of the log2 ratios.
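For anyone who wants to reproduce the comparison, here is a minimal sketch of how the IQR of the log2 ratios could be computed from a `.cnr` file produced by each version. It assumes the file is tab-separated with a `log2` column; the helper name `log2_iqr` is hypothetical and not part of CNVkit.

```python
import csv
import statistics
from typing import Iterable


def log2_iqr(lines: Iterable[str]) -> float:
    """Interquartile range of the log2 column in a tab-separated .cnr table.

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    values = [float(row["log2"]) for row in reader]
    # quantiles(n=4) returns the three quartile cut points: Q1, median, Q3.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1
```

Calling `log2_iqr(open("sample.cnr", newline=""))` on the output of each version should give two numbers that can be compared directly; a larger IQR indicates a wider spread of bin-level ratios.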

We've traced it back to this commit; I noticed that the smoothing window fraction now defaults to a limit of 0.01 (previously a flat 0.1). Could this be under-correcting the log2 ratios during the fix step?

$ wc -l sample.target.coverage.cnn
43719 sample.target.coverage.cnn

$ wc -l sample.antitarget.coverage.cnn
3143 sample.antitarget.coverage.cnn
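To put the two window fractions in terms of these file sizes, here is a rough back-of-the-envelope sketch. It assumes each `.cnn` file has one header line and that the smoothing window spans roughly `fraction * number_of_bins` bins; CNVkit's own rounding of the window width may differ, and `window_bins` is a hypothetical helper, not a CNVkit function.

```python
def window_bins(n_lines: int, fraction: float, header_lines: int = 1) -> int:
    """Approximate number of bins covered by a smoothing window.

    Assumption: window width is simply fraction * bin count, rounded.
    """
    n_bins = n_lines - header_lines
    return round(n_bins * fraction)


# Line counts taken from the `wc -l` output above.
for label, n_lines in [("target", 43719), ("antitarget", 3143)]:
    for fraction in (0.1, 0.01):
        print(label, fraction, window_bins(n_lines, fraction))
```

Under these assumptions the target window shrinks from about 4372 bins to about 437, and the antitarget window from about 314 bins to about 31, so each bias estimate would be based on roughly a tenth as many bins.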

$ cnvkit.py fix --no-gc sample.target.coverage.cnn sample.antitarget.coverage.cnn sample.reference.cnn -o sample.cnr -i sample

I've opened up a PR to allow manual tuning of this parameter during the fix step: https://github.com/etal/cnvkit/pull/860

Smoothing window fraction fix results in noisier data in targeted sequencing panel? #859