etal / cnvkit

Copy number variant detection from targeted DNA sequencing
http://cnvkit.readthedocs.org

Filtering single-cell false positive calls #907

Closed gevro closed 1 month ago

gevro commented 2 months ago

Hello, single-cell capture data has a fair number of false positive calls, but filtering on the 'weight' column seems to do a good job of removing them.

Is it reasonable to use 'weight' as a filter? It seems to perform very well, so I'm curious why filtering on weight is not a standard option.

etal commented 1 month ago

Feel free to apply the filter in your pipeline if it helps; CNVkit is a toolkit. It's sometimes helpful to delete misbehaving bins from the reference profile (reference.cnn) where you see recurring false positives.
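
A pipeline-side weight filter of the kind discussed here might look roughly like the sketch below. It uses only the Python standard library on a toy tab-separated table rather than any CNVkit API. The column layout (chromosome, start, end, gene, log2, depth, weight), the helper name `filter_bins_by_weight`, the 0.5 cutoff, and all the data values are assumptions for illustration; check them against your own .cnr files and pick a threshold that suits your data.

```python
import csv
import io


def filter_bins_by_weight(rows, min_weight):
    """Keep only the bins whose 'weight' value is at least min_weight."""
    return [row for row in rows if float(row["weight"]) >= min_weight]


# Toy stand-in for a tab-separated bin-level file; the values are made up.
toy_cnr = (
    "chromosome\tstart\tend\tgene\tlog2\tdepth\tweight\n"
    "chr1\t1000\t2000\tGENE_A\t0.05\t120.0\t0.95\n"
    "chr1\t2000\t3000\tGENE_A\t1.40\t3.0\t0.10\n"
    "chr1\t3000\t4000\tGENE_B\t-0.02\t110.0\t0.90\n"
)

rows = list(csv.DictReader(io.StringIO(toy_cnr), delimiter="\t"))

# 0.5 is an arbitrary example cutoff, not a recommended default.
kept = filter_bins_by_weight(rows, min_weight=0.5)

for row in kept:
    print(row["chromosome"], row["start"], row["end"], row["log2"], row["weight"])
```

The same row-dropping pattern could be applied to a list of recurring false-positive bins in the reference profile, matching on chromosome, start, and end instead of on weight.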

By default, the bins with lower "weight" will have less influence on the segmentation and segment means, and bins with a weight of 0.0 will be essentially ignored.
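
To illustrate why a low weight reduces a bin's influence, here is a plain weighted average over a few made-up log2 values. This is only a sketch of the general idea, not CNVkit's actual segmentation or segment-mean code; the function name `weighted_mean` and the numbers are hypothetical.

```python
def weighted_mean(log2_values, weights):
    """Average log2 values, letting each bin count in proportion to its weight."""
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one bin must have a nonzero weight")
    return sum(v * w for v, w in zip(log2_values, weights)) / total


# Two well-behaved bins near 0.0 and one outlier bin at 2.0 (made-up values).
log2_values = [0.0, 0.0, 2.0]

# With equal weights the outlier pulls the mean upward.
equal_weights = weighted_mean(log2_values, [1.0, 1.0, 1.0])

# With a weight of 0.0 the outlier bin contributes nothing to the mean.
outlier_zeroed = weighted_mean(log2_values, [1.0, 1.0, 0.0])

print(equal_weights, outlier_zeroed)
```

A hard filter on weight goes one step further than this: it removes the low-weight bins entirely instead of only shrinking their contribution.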