Open HenrikBengtsson opened 9 years ago
As a start, I plan to use a single weight vector w (Approach 1 above). Only after this step will I consider adding support for an optional second weight vector (Approach 2).
Hi Henrik,
I use CBS and PSCBS to segment whole-exome sequencing data. For CBS, I use a pool of normals to find good weights, i.e. setting them proportional to the inverse of the log-ratio standard deviations in the pool. B-allele frequencies in whole-exome data have much less bias than coverage, and the two biases are not necessarily correlated. I think it would be useful if the weights could be restricted to one of the steps. Two different weight vectors would be a bonus (not sure I need them).
Thanks for your great packages, Markus
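The pool-of-normals weighting described in the comment above can be sketched as follows. This is a minimal illustration in Python rather than R, covering only the arithmetic: the function name inverse_sd_weights, the input layout (one list of log-ratios per locus, one value per normal sample), the rescaling so the largest weight is 1, and the handling of zero standard deviations are all assumptions of this sketch, not something stated in the thread or taken from PSCBS.

```python
from statistics import stdev


def inverse_sd_weights(normal_log_ratios):
    """Per-locus weights proportional to 1 / SD of log-ratios across a pool of normals.

    normal_log_ratios: list with one entry per locus; each entry is a list of
    log-ratios, one per normal sample (at least two samples per locus).
    Hypothetical helper for illustration only.
    """
    sds = [stdev(values) for values in normal_log_ratios]

    # Assumption of this sketch: loci with zero SD are clamped to the smallest
    # positive SD so that the inverse stays finite.
    positive = [sd for sd in sds if sd > 0]
    if not positive:
        # No variation anywhere in the pool: fall back to equal weights.
        return [1.0] * len(sds)
    floor = min(positive)

    raw = [1.0 / max(sd, floor) for sd in sds]

    # Assumption of this sketch: rescale so the largest weight equals 1.
    top = max(raw)
    return [r / top for r in raw]


# Three loci, three normals each; noisier loci receive smaller weights.
pool = [
    [0.0, 0.1, -0.1],
    [0.0, 0.2, -0.2],
    [0.0, 0.4, -0.4],
]
weights = inverse_sd_weights(pool)
```

A vector computed this way would play the role of the locus-level weights that segmentByCBS() already supports; how it is passed in is left to the package documentation.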
Thanks @lima1 for the feedback. To support optional weights in either step, I think we basically need to implement Option 2 (weights specific to each of TCN and DH).
A half-way approach between implementing Option 1 (same weights for TCN and DH) and Option 2 would be a third option: TCN-only weights, with DH weights ignored until Option 2 is implemented. That might be the most straightforward step toward supporting weighted PSCBS segmentation, especially since segmentByCBS() already supports weights. I've updated my top comment with Option 3.
Option 3 would be perfect for me. Thanks again!
Hi Henrik, I'm quite happy with the weighting implemented in my PR and have tested it on many samples. It cleans up some of the noisy regions quite a bit.
If you have any concerns about the current patch, I'm happy to work on it more to get it into the next PSCBS release.
Thanks again! Markus
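Add support for weights to segmentByPairedPSCBS() and segmentByNonPairedPSCBS(). This can be done in at least two different ways:
Option 1: the same weights for both TCN and DH segmentation (a single weight vector w).
Option 2: weights specific to each of TCN and DH (an optional second weight vector).
Option 3: TCN-only weights, with DH weights ignored until Option 2 is implemented.
One rationale for separate weight vectors is that one might want to use weights for the DH segmentation that are a function of, say, the confidence scores of the genotype calls.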
UPDATE (2016-04-24): Added third option of only TCN weights.
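The three options discussed in this thread differ only in which weights reach the TCN step and which reach the DH step. A minimal Python sketch of that mapping is given below; resolve_weights and its parameters w_tcn and w_dh are hypothetical names invented for illustration and do not correspond to any PSCBS function or argument.

```python
def resolve_weights(option, w_tcn=None, w_dh=None):
    """Return (tcn_weights, dh_weights) for one of the three options.

    None means that step is segmented without weights.
    Hypothetical helper: names and signature are assumptions of this sketch.
    """
    if option == 1:
        # Option 1: one weight vector shared by the TCN and DH steps.
        return w_tcn, w_tcn
    if option == 2:
        # Option 2: each step gets its own (optional) weight vector.
        return w_tcn, w_dh
    if option == 3:
        # Option 3: weights for the TCN step only; any DH weights are ignored.
        return w_tcn, None
    raise ValueError(f"unknown option: {option!r}")


w = [1.0, 0.5, 0.25]
genotype_confidence = [0.9, 0.9, 0.4]  # e.g. DH weights derived from genotype-call confidence

shared = resolve_weights(1, w_tcn=w)
separate = resolve_weights(2, w_tcn=w, w_dh=genotype_confidence)
tcn_only = resolve_weights(3, w_tcn=w, w_dh=genotype_confidence)
```

Under this reading, Option 3 is Option 2 with the DH weights dropped, which is why it can serve as an intermediate step.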