tomerk closed this pull request 8 years ago.
Can we do some refactoring of the Sparse and Dense LBFGS to avoid code duplication?
Sure! Any suggestions on how you'd like it done, or which code is actually duplicated? (I'm not super familiar with the details here, and I don't want to accidentally introduce a bug by making two code paths that are supposed to do different things do the same thing.)
The duplicated code is in SparseLBFGS.scala and LBFGS.scala. We should be able to pass in either a sparse or a dense gradient and unify parts of this code.
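To make that concrete, here's a rough sketch of the kind of shared gradient interface that would let a single LBFGS runner serve both cases. This is not the actual KeystoneML code; the trait and class names are made up for illustration.

```scala
import breeze.linalg.{DenseVector, SparseVector, Vector}

// Hypothetical gradient interface: one trait, parameterized over the feature type,
// so the LBFGS loop itself can be written once and reused for sparse or dense data.
trait ExampleGradient[T <: Vector[Double]] extends Serializable {
  /** Returns (gradient, loss) at the given weights for one (features, label) example. */
  def compute(features: T, label: Double, weights: DenseVector[Double]): (DenseVector[Double], Double)
}

class DenseLeastSquaresGradient extends ExampleGradient[DenseVector[Double]] {
  def compute(features: DenseVector[Double], label: Double,
              weights: DenseVector[Double]): (DenseVector[Double], Double) = {
    val diff = (features dot weights) - label
    (features * diff, 0.5 * diff * diff)
  }
}

class SparseLeastSquaresGradient extends ExampleGradient[SparseVector[Double]] {
  def compute(features: SparseVector[Double], label: Double,
              weights: DenseVector[Double]): (DenseVector[Double], Double) = {
    val diff = (features dot weights) - label
    // Only the non-zero entries contribute to the gradient.
    val grad = DenseVector.zeros[Double](weights.length)
    features.activeIterator.foreach { case (i, v) => grad(i) = v * diff }
    (grad, 0.5 * diff * diff)
  }
}
```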
Okay @shivaram I've done a whole bunch of refactoring. Let me know what you think. Also, a suggestion for a test for the Sparse Least Squares LBFGS would be helpful.
Okay @shivaram how's this look now?
@shivaram I've given this some more thought and I think the right thing to do might be to have the auto-optimized linear solver be able to choose between sparse and dense options. That way we won't have to deal with the extra type complexity involved in unifying the sparse and dense stuff for now.
Hmm... I think @shivaram's point is about sharing redundant code and minimizing interface surface. Am I right?
I agree that we should dynamically switch implementations, but let's think through what would need to happen to unify sparse/dense LBFGS?
1) Gradient would take a Vector and could pattern match at runtime on whether the data is sparse or dense and do the right thing. 2) LinearMapper would take a Vector and could do the same.
Note that this is a different option than type parameters in the run method
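Roughly, option (1) could look like the sketch below: a single gradient that accepts any Breeze Vector and dispatches on the concrete type at runtime. The names here are illustrative, not the actual KeystoneML classes.

```scala
import breeze.linalg.{DenseVector, SparseVector, Vector}

// Hypothetical single gradient that pattern matches on the runtime vector type.
class LeastSquaresGradient extends Serializable {
  def compute(features: Vector[Double], label: Double,
              weights: DenseVector[Double]): (DenseVector[Double], Double) = {
    features match {
      case d: DenseVector[Double] =>
        val diff = (d dot weights) - label
        (d * diff, 0.5 * diff * diff)
      case s: SparseVector[Double] =>
        val diff = (s dot weights) - label
        // Touch only the non-zero entries for sparse inputs.
        val grad = DenseVector.zeros[Double](weights.length)
        s.activeIterator.foreach { case (i, v) => grad(i) = v * diff }
        (grad, 0.5 * diff * diff)
      case other =>
        val d = other.toDenseVector
        val diff = (d dot weights) - label
        (d * diff, 0.5 * diff * diff)
    }
  }
}
```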
Right, but I'm not convinced the remaining code is that redundant now that the LBFGS run itself is shared for both sparse and dense gradients. What's left is distinct code paths for the sparse and dense cases. For example, look at how MLlib has two code paths in many places for sparse vs. dense; that minimizes the interface surface but arguably adds internal complexity.
In this case we would still need two pathways within the gradient for both sparse and dense, we would need to introduce a sparse pathway to StandardScaler, and we would need logic to choose which normalization to do before running LBFGS and which linear mapper to use.
If we're just looking to minimize the interface surface then the above approach of letting the auto-optimized linear solver decide would be enough.
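For reference, a minimal sketch of what "let the auto-optimized linear solver decide" could mean, assuming a simple sampled-sparsity heuristic; the object name, threshold, and heuristic are placeholders, not the actual KeystoneML logic.

```scala
import breeze.linalg.{SparseVector, Vector}
import org.apache.spark.rdd.RDD

// Placeholder chooser: estimate sparsity from a sample and route to the sparse or
// dense LBFGS pathway accordingly.
object SolverChooser {
  def useSparsePath(features: RDD[Vector[Double]], threshold: Double = 0.1): Boolean = {
    val sample = features.take(100)
    val fracNonZero = sample.map {
      case s: SparseVector[Double] => s.activeSize.toDouble / s.length
      case _ => 1.0 // dense vectors count as fully non-zero
    }.sum / math.max(sample.length, 1)
    fracNonZero < threshold
  }
}
```

The caller would then pick the sparse gradient plus sparse-aware scaling, or the dense gradient plus StandardScaler, based on this flag.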
Okay @etrain and @shivaram, I've added intercept fitting options to both sparse and dense LBFGS, and made the regularization exclude the intercept when it is being fit.
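As a sanity check on what "the regularization excludes the intercept" means, here's a minimal sketch. It assumes the intercept is stored as the last weight, which is an assumption for illustration, not necessarily how the PR lays out the weight vector.

```scala
import breeze.linalg.DenseVector

// Hypothetical helper: L2 penalty over the weights only, skipping the intercept term.
object RegularizedLoss {
  /** weights = [w_1 ... w_d, intercept]; only w_1..w_d are penalized when fitIntercept is true. */
  def l2Penalty(weights: DenseVector[Double], regParam: Double, fitIntercept: Boolean): Double = {
    val d = if (fitIntercept) weights.length - 1 else weights.length
    val w = weights(0 until d)
    0.5 * regParam * (w dot w)
  }
}
```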
Okay @shivaram, I fixed the regularization bug and am now checking the learned intercepts in the unit tests.
I had one minor comment about the local least squares estimator. Otherwise the change LGTM.
@shivaram I've added a new test to check the local least squares estimator.
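For context, the kind of check such a test performs, sketched against plain Breeze rather than the actual estimator API: generate labels from known weights, solve locally, and verify the weights are recovered.

```scala
import breeze.linalg.{DenseMatrix, DenseVector, norm}

// Illustrative check only; the real test exercises the estimator class itself.
object LocalLeastSquaresCheck {
  def main(args: Array[String]): Unit = {
    val x = DenseMatrix.rand(200, 5)                      // random features
    val trueW = DenseVector(1.0, -2.0, 3.0, 0.5, -1.5)    // known generating weights
    val y = x * trueW                                     // noiseless labels
    val learnedW = x \ y                                  // unregularized least squares solve
    assert(norm(learnedW - trueW) < 1e-6, "learned weights should match the generating weights")
  }
}
```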
LGTM. @etrain can you take a final pass and merge?
Will do.
There were a few style issues related to copy/pasting code from MLlib as the basis for this (modified) least squares solver. Once that is cleaned up, this seems reasonable to me.
@etrain removed
LGTM - Thanks @tomerk and @shivaram !
issue #182