cicdw opened this issue 7 years ago
As we discussed earlier, I think this is a really cool idea, and I'm glad to be part of the discussion.
As a novice to this (and for the purposes of furthering the discussion), do you know of any good surveys of the academically well-grounded approaches, and/or any higher-level discussions of the benefits of Smart Initialization™?
No surveys that I know of unfortunately, but here's a list off the top of my head:
Ultimately, I think the biggest bang will come from smart initializations when refitting a model, but I'd like to include at least a little thought on initializations from scratch as well.
An all-too-often ignored side of optimization is the initialization; there is a lot of research out there suggesting that for both convex (and even more so for non-convex) optimization problems, a large amount of work can be saved by initializing algorithms at clever starting values.
Currently we are initializing all algorithms with the 0 vector. Once the API (https://github.com/dask/dask-glm/issues/11) is sorted out, we should have multiple options for how to initialize, including (but not limited to):
- a `refit` method (to be raised in a future issue)

cc: @mpancia
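To make the warm-start idea above concrete before the API settles, here is a framework-agnostic NumPy sketch: a plain gradient-descent logistic fit that accepts an initial coefficient vector, so a refit on slightly updated data can start from the previous solution instead of the 0 vector. The function `fit_logistic` and all variable names are hypothetical illustrations, not dask-glm API.

```python
import numpy as np

def fit_logistic(X, y, beta0, lr=0.1, tol=1e-5, max_iter=20_000):
    """Gradient descent on the mean logistic loss, starting from beta0.

    Returns the fitted coefficients and the number of iterations used,
    so we can compare cold vs. warm starts. (Hypothetical helper.)
    """
    beta = beta0.copy()
    for it in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))      # predicted probabilities
        grad = X.T @ (p - y) / len(y)            # gradient of mean log-loss
        beta -= lr * grad
        if np.linalg.norm(grad) < tol:
            break
    return beta, it + 1

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
true_beta = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = (rng.uniform(size=500) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

# Cold start from the 0 vector (the current behavior described above).
beta_cold, n_cold = fit_logistic(X, y, np.zeros(5))

# "Refit" scenario: a few new rows arrive; warm-start from the old solution.
X_new = rng.normal(size=(20, 5))
y_new = (rng.uniform(size=20) < 1.0 / (1.0 + np.exp(-X_new @ true_beta))).astype(float)
X2 = np.vstack([X, X_new])
y2 = np.concatenate([y, y_new])

beta_warm, n_warm = fit_logistic(X2, y2, beta_cold)   # warm start
beta_zero, n_zero = fit_logistic(X2, y2, np.zeros(5)) # cold start, for comparison
```

Since the old solution is already close to the new optimum when only a handful of rows change, the warm-started fit should need no more iterations than the cold start; whatever shape the API takes, exposing an `init`-style argument is what makes this comparison possible.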