Thanks for the feedback. I have been debating changing `stop` to `warning`. The original reason was that we often only observed the weights all being 0 when something went wrong, so the error was helpful. Since there is no intercept in that method, what is the preference for the predicted values? A vector of NA or a vector of 0?
Interesting. I was getting 0 weights but no convergence errors. I'm not very familiar with the NNLS algorithms to know if, e.g., changing starting values would've moved weights away from 0 (i.e., are local minima possible? I don't know).

A sensible thing might be to look at the convergence criteria reported back by the NNLS routine and only `warn`/`stop` if it did not converge.
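For reference, a minimal sketch of what that check might look like, assuming the metalearner keeps using `nnls::nnls` and that `Z` (the matrix of cross-validated predictions) and `Y` (the outcome) are already in hand:

```r
# Sketch only: inspect the termination flag returned by the Lawson-Hanson
# NNLS routine and warn (rather than stop) when it did not converge.
library(nnls)

fit_nnls <- nnls(Z, Y)       # Z: n x K matrix of CV predictions, Y: outcome
if (fit_nnls$mode != 1) {    # mode 1 indicates success in the underlying code
  warning("NNLS routine did not converge; weights may be unreliable.")
}
coef <- fit_nnls$x           # non-negative weights (possibly all zero)
```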
On a related note, it could be helpful to add some error catching into the weight-finding steps. E.g., `try` to find weights and, if it breaks, do something sane like giving all weight to the discrete super learner. This would help with the painful runs where you spend a long time training learners, only to have things break in the ensemble step.
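One way that fallback could look, sketched here under the assumption that `Z`, `Y`, and the cross-validated risks are already available (none of these names come from the package):

```r
# Hypothetical fallback sketch: if weight estimation errors out, put all
# weight on the single learner with the lowest cross-validated risk,
# i.e., the discrete super learner.
library(nnls)

cv_risk <- colMeans((Y - Z)^2)          # CV risk of each candidate learner

coef <- tryCatch({
  fit_nnls <- nnls(Z, Y)
  fit_nnls$x
}, error = function(e) {
  warning("Weight estimation failed; falling back to the discrete super learner.")
  as.numeric(seq_len(ncol(Z)) == which.min(cv_risk))
})
```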
Are you observing this on a simulated dataset or a real data set with a sensible set of algorithms?
This is easy to reproduce with a dataset absent of any signal and with a true population mean of zero (you can add a predictor that always predicts the true population mean, and that should get a non-zero weight, but that isn't realistic); it can also be reproduced with predictors negatively correlated with the outcome. The algorithm may report that it converged, but these are somewhat pathological cases. I agree we can remove the error and replace it with a warning, but I'm curious whether you are observing this outside of these edge cases.
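For what it's worth, a small simulation along those lines (a sketch, not taken from the thread; the learner library is just an illustrative choice):

```r
# Pure-noise outcome with population mean zero: the learners' cross-validated
# predictions tend to be weakly anti-correlated with Y, so NNLS can return
# all-zero weights even while reporting convergence.
library(SuperLearner)

set.seed(1)
n <- 200
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
Y <- rnorm(n)                           # no signal, true mean zero

fit <- SuperLearner(Y = Y, X = X,
                    SL.library = c("SL.glm", "SL.lm"),
                    method = "method.NNLS",
                    cvControl = list(V = 5))
fit$coef                                # may come back all zero
```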
The code has been updated to give a warning instead of an error and to return predictions of all 0 if all coefficients are 0.
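Roughly, the described change amounts to something like this (a sketch of the behavior, not the package's verbatim source):

```r
# computePred warns and returns 0 for every observation when all metalearner
# coefficients are zero, instead of calling stop().
computePred <- function(predY, coef, ...) {
  if (all(coef == 0)) {
    warning("All metalearner coefficients are zero; returning predictions of 0.")
    return(rep(0, nrow(predY)))
  }
  crossprod(t(predY[, coef != 0, drop = FALSE]), coef[coef != 0])
}
```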
I just had an analysis where I was getting errors thrown from `predict.SuperLearner` about all the weights being 0, so that it couldn't predict. After debugging, I found the error is apparently thrown by these lines in the `computePred` function of `method.NNLS`.

I think you all should consider removing this flag. If the NNLS algorithm returns all 0 weights, well, then, weighting all algorithms with 0 weight leads to the best cross-validated risk, and the resultant super learner predictions should respect that.
In general, there is a little dissonance in the NNLS methods, as they're doing the non-negative least squares minimization but then re-weighting to a convex combination. So I see why the temptation is there to throw an error since you can't make a convex combination out of all 0's.
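Concretely, the step in question looks something like this (simplified sketch; `Z` and `Y` as above):

```r
# method.NNLS-style weighting: fit non-negative weights, then rescale them
# to sum to 1. The rescaling is undefined when every weight is zero.
library(nnls)

fit_nnls <- nnls(Z, Y)
beta <- fit_nnls$x            # non-negative, possibly all zero
coef <- beta / sum(beta)      # NaN for every entry if sum(beta) == 0
```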
I would personally rather see the `CC` methods get moved to the default and have the `NNLS` methods not even try to do the re-weighting to a convex combination. This would be more in line with the theory, which states that the ensemble is minimizing cross-validated risk over some class of ensemble estimators (the `CC` methods over convex weights, the `NNLS` methods over non-negative weights).

Just wanted to get the discussion going. Feel free to close the issue and follow up offline if you'd like.
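If it helps the discussion, switching metalearners is just a matter of the `method` argument, e.g. (reusing the simulated data from the earlier sketch):

```r
# Convex-combination least squares metalearner instead of the NNLS default.
fit_cc <- SuperLearner(Y = Y, X = X,
                       SL.library = c("SL.glm", "SL.lm"),
                       method = "method.CC_LS")
```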