Closed erflynn closed 8 months ago
Hi @erflynn,
`lambda` is estimated at runtime, during the regression step. When `lambda=NULL`, for each covariate, lambda (which sets the ridge-regression shrinkage) differs for each cluster k and batch b. Harmony uses an expectation E, which gives the expected number of cells from that batch in the cluster, given the current size of the cluster (in cells) and the number of cells in that batch. The smaller E is, the larger lambda gets, which shrinks the correction (that batch is corrected less in that cluster). This tends to protect against overcorrection in some cases. For simple datasets, I would not set that parameter because we have tweaked other parts of our formula to avoid overcorrection. But if you notice the cost of the objective function increasing with harmony iterations (visible by setting `plot_convergence=TRUE`), then this automatic lambda estimation would be something to try out.
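To make the role of E more concrete, here is a minimal sketch with made-up numbers. It assumes E is the cluster size multiplied by the fraction of all cells that belong to the batch (that is, cluster and batch membership are treated as independent). The variable names are illustrative and are not taken from the Harmony source, and the sketch deliberately stops at E rather than guessing how lambda is derived from it.

```r
# Made-up cluster sizes (for soft clustering: summed assignment probabilities).
cluster_size <- c(k1 = 800, k2 = 150, k3 = 50)

# Made-up number of cells in each batch.
batch_cells <- c(b1 = 900, b2 = 90, b3 = 10)

# Assumption: E[k, b] = (size of cluster k) * (fraction of all cells in batch b).
batch_frac <- batch_cells / sum(batch_cells)
E <- outer(cluster_size, batch_frac)

# Rows are clusters, columns are batches.
print(round(E, 1))
```

The smallest entries of E come from small clusters combined with small batches. Per the explanation above, those are the cluster/batch pairs that receive the most shrinkage.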
`tau` is used here to set theta and scale it according to the number of batches per covariate: `theta <- theta * (1 - exp(-(N_b / (nclust * tau))^2))`
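As a rough illustration of that formula, the sketch below wraps it in a helper and evaluates it for a few made-up batch sizes. `scale_theta` is a hypothetical name, not a function exported by harmony, and the sketch assumes `N_b` holds the number of cells in each batch.

```r
# Hypothetical helper around the formula quoted above; not part of harmony.
scale_theta <- function(theta, N_b, nclust, tau) {
  theta * (1 - exp(-(N_b / (nclust * tau))^2))
}

# Made-up inputs: three batches of very different size (assumed: cells per batch).
N_b <- c(50, 500, 5000)
scaled <- scale_theta(theta = 2, N_b = N_b, nclust = 100, tau = 5)

# One scaled theta per batch.
print(round(scaled, 3))
```

With these inputs the scaling factor lies between 0 and 1 and grows with `N_b`, so small batches end up with a theta close to 0 while large batches keep almost the full value.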
thank you for the explanation! this is very helpful for understanding lambda estimation and tau!
what was updated in the latest harmony versions that improves the overcorrection?
> tweaked other parts of our formula to avoid overcorrection

As described in the manuscript, the diversity penalty is now calculated as `log(O+E/O)` instead of `log(O/E)`, and the optimization process is updated accordingly. If `O` gets small then the diversity penalty becomes very large. Small `O` values are especially likely when non-overlapping cell types exist in the different batches.
> lambda estimation

Using the same logic, this measure is applied at the regression step. Essentially, for small values of `E`, more shrinkage is applied for a given batch in a cluster.
that's helpful - thanks for the info! which version includes this update?
Version 1.2, distributed on CRAN, and the GitHub master branch both implement these changes.
great - thank you!
Re issue #157 @pati-ni - what are the updates to Harmony that mitigate the overcorrection? Also - could you describe how automated lambda selection works (e.g. when setting `lambda=NULL`)? I've tried to understand this from the code, but am finding it difficult. The `harmony_options()` documentation mentions use of `tau`, but I also cannot find this in the code -- is this used in lambda selection or is it an alternate method? Thank you!