amplab / keystone

Simplifying robust end-to-end machine learning on Apache Spark.
http://keystone-ml.org/
Apache License 2.0

moved over shiv's least squares solvers #192

Closed · tomerk closed this 8 years ago

tomerk commented 8 years ago

issue #182

shivaram commented 8 years ago

Can we do some refactoring of the Sparse and Dense LBFGS to avoid code duplication?

tomerk commented 8 years ago

Sure! Any suggestions for how you want it done / what code is actually being duplicated? (I'm not super familiar with the details of this, and don't want to accidentally introduce a bug by making two code paths that are supposed to do different things end up doing the same thing.)

shivaram commented 8 years ago

The duplicated code is in SparseLBFGS.scala and LBFGS.scala. We should be able to pass in either a sparse or a dense gradient and unify parts of this code.
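For reference, a minimal sketch of the kind of unification being suggested, using made-up names (`Gradient`, `runLBFGS`) rather than KeystoneML's actual classes: a single LBFGS driver is handed the gradient computation, so the sparse and dense solvers differ only in the `Gradient` they pass in.

```scala
import breeze.linalg.{DenseVector, Vector => BVector}

// Hypothetical sketch, not the PR's actual code: the gradient is the only
// piece that needs to know about the data representation.
trait Gradient {
  // Returns (loss, gradient contribution) for one example at the current weights.
  def compute(data: BVector[Double], label: Double,
              weights: DenseVector[Double]): (Double, DenseVector[Double])
}

class DenseLeastSquaresGradient extends Gradient {
  def compute(data: BVector[Double], label: Double,
              weights: DenseVector[Double]): (Double, DenseVector[Double]) = {
    val x = data.toDenseVector
    val residual = (x dot weights) - label
    (0.5 * residual * residual, x * residual)
  }
}

// A sparse implementation would only differ inside compute(); the shared
// driver below never inspects the concrete vector type.
def runLBFGS(gradient: Gradient /*, data, labels, numIterations, regParam, ... */): DenseVector[Double] = ???
```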

tomerk commented 8 years ago

Okay @shivaram I've done a whole bunch of refactoring. Let me know what you think. Also, a suggestion for a test for the Sparse Least Squares LBFGS would be helpful.

tomerk commented 8 years ago

Okay @shivaram how's this look now?

tomerk commented 8 years ago

@shivaram I've given this some more thought and I think the right thing to do might be to have the auto-optimized linear solver be able to choose between sparse and dense options. That way we won't have to deal with the extra type complexity involved in unifying the sparse and dense stuff for now.
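For concreteness, one way that dispatch could look, assuming Breeze vectors; the helper below is illustrative only, not KeystoneML's actual optimizer API.

```scala
import breeze.linalg.{SparseVector, Vector => BVector}
import org.apache.spark.rdd.RDD

// Illustrative sketch only: the auto-optimized solver samples the input and
// routes to the sparse or dense LBFGS implementation, so neither solver
// needs a unified type signature.
def preferSparseSolver(data: RDD[BVector[Double]]): Boolean = {
  val sample = data.take(100)
  val numSparse = sample.count(_.isInstanceOf[SparseVector[_]])
  // Pick the sparse solver when most sampled rows use a sparse representation.
  numSparse * 2 > sample.length
}
```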

etrain commented 8 years ago

Hmm... I think @shivaram's point is about sharing redundant code and minimizing interface surface. Am I right?

I agree that we should dynamically switch implementations, but let's think through what would need to happen to unify sparse/dense LBFGS:

1) Gradient would take a Vector and could pattern match at runtime on whether the data is sparse or dense and do the right thing (a rough sketch follows below).
2) LinearMapper would take a Vector and could do the same.

Note that this is a different option from using type parameters in the run method.
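A rough sketch of option (1), assuming Breeze vectors; `axpyGradient` is a made-up helper for illustration, not code from this PR.

```scala
import breeze.linalg.{axpy, DenseVector, SparseVector, Vector => BVector}

// One gradient entry point that accepts any Vector and pattern matches on
// the concrete representation at runtime (sketch only).
def axpyGradient(data: BVector[Double], scale: Double, grad: DenseVector[Double]): Unit = data match {
  case dv: DenseVector[Double] =>
    // Dense fast path: BLAS-style axpy over the whole array.
    axpy(scale, dv, grad)
  case sv: SparseVector[Double] =>
    // Sparse fast path: only touch the active indices.
    var i = 0
    while (i < sv.activeSize) {
      grad(sv.indexAt(i)) += scale * sv.valueAt(i)
      i += 1
    }
  case other =>
    axpy(scale, other.toDenseVector, grad)
}
```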

tomerk commented 8 years ago

Right, but I'm not convinced the remaining code is that redundant now that the LBFGS run loop itself is shared for both sparse and dense gradients. What remains is distinct code paths for the sparse and dense cases. E.g., look at how MLlib has separate sparse and dense code paths in many places: that does minimize the interface surface, but arguably adds internal complexity.

In this case we would still need two pathways within the gradient for both sparse and dense, we would need to introduce a sparse pathway to StandardScaler, and we would need logic to choose which normalization to do before running LBFGS and which linear mapper to use.

If we're just looking to minimize the interface surface, then the approach above of letting the auto-optimized linear solver decide would be enough.

tomerk commented 8 years ago

Okay @etrain and @shivaram, I've added intercept fitting options to both sparse and dense LBFGS, and excluded the intercept from regularization when it is being fit.
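For clarity, a hedged sketch of that second point; the function and the convention of storing the intercept as the last weight are assumptions for illustration, not the PR's actual code.

```scala
import breeze.linalg.{norm, DenseVector}

// Sketch only: assume the intercept, when fit, is stored as the last entry
// of the weight vector.
def regularizedLoss(dataLoss: Double, weights: DenseVector[Double],
                    fitIntercept: Boolean, lambda: Double): Double = {
  val featureWeights =
    if (fitIntercept) weights(0 until weights.length - 1) else weights
  // L2 penalty over the feature weights only; the intercept is not penalized.
  dataLoss + 0.5 * lambda * math.pow(norm(featureWeights), 2)
}
```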

tomerk commented 8 years ago

Okay @shivaram, I fixed the regularization bug and am now checking the learned intercepts in the unit tests.

shivaram commented 8 years ago

I had one minor comment about the local least squares estimator. Otherwise the change LGTM.

tomerk commented 8 years ago

@shivaram I've added a new test to check the local least squares estimator.
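As an aside, the kind of check such a test can make (a hypothetical example, not the test added in this PR) is to generate labels from known weights and confirm a local least squares solve recovers them.

```scala
import breeze.linalg.{norm, pinv, DenseMatrix, DenseVector}
import org.scalatest.FunSuite

// Hypothetical example of such a check, not the test added in this PR.
class LocalLeastSquaresSketch extends FunSuite {
  test("recovers known weights on noiseless data") {
    val x = DenseMatrix.rand(200, 5)
    val trueWeights = DenseVector(1.0, -2.0, 0.5, 3.0, 0.0)
    val y = x * trueWeights
    // Normal equations: w = (X^T X)^{-1} X^T y
    val learned = pinv(x.t * x) * (x.t * y)
    assert(norm(learned - trueWeights) < 1e-6)
  }
}
```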

shivaram commented 8 years ago

LGTM. @etrain can you take a final pass and merge?

etrain commented 8 years ago

Will do.

etrain commented 8 years ago

There were a few style issues from the MLlib code that was copied over as the basis for this (modified) least squares solver. Once those are cleaned up, this seems reasonable to me.

tomerk commented 8 years ago

@etrain removed

etrain commented 8 years ago

LGTM - Thanks @tomerk and @shivaram!