scalanlp / breeze

Breeze is/was a numerical processing library for Scala.
https://www.scalanlp.org
Apache License 2.0
3.45k stars 693 forks source link

Implement the L-BFGS-B algorithm #323

Closed tartakynov closed 9 years ago

tartakynov commented 9 years ago

In a real-world when you are solving an optimization problem it is often required to specify a bound constrains on variables, so it would be nice to implement L-BFGS-B algorithm, i.e. extend L-BFGS to handle simple box constrains on variables

dlwh commented 9 years ago

It would be nice to implement LBFGS-B, but in the meantime, take a look at our ProjectedQuasiNewton optimizer, which can handle arbitrary constraints, not just box constraints.

-- David

On Thu, Nov 6, 2014 at 9:28 AM, Artem Tartakynov notifications@github.com wrote:

In a real-world when you are solving an optimization problem it is often required to specify a bound constrains on variables, so it would be nice to implement L-BFGS-B algorithm, i.e. extend L-BFGS to handle simple box constrains on variables

— Reply to this email directly or view it on GitHub https://github.com/scalanlp/breeze/issues/323.

tartakynov commented 9 years ago

Thank you for reply. Are there any examples of usage of this optimizer? Can I use it with ApproximateGradientFunction? I have reviewed the tests, but it's not obvious how to specify constraints

dlwh commented 9 years ago

Best I can do is the unit tests: https://github.com/scalanlp/breeze/blob/master/math/src/test/scala/breeze/optimize/ProjectedQuasiNewtonTest.scala

Basically it's like a normal optimization package, except you also give it a function T=>T (where, e.g. T = DenseVector[Double]) that enforces the constraints by projecting a point to the closest point that it's in the feasible set. So box constraints can just be done by clamping each element.

-- David

On Thu, Nov 6, 2014 at 10:45 AM, Artem Tartakynov notifications@github.com wrote:

Thank you for reply. Are there any examples of usage of this optimizer? Can I use it with ApproximateGradientFunction?

— Reply to this email directly or view it on GitHub https://github.com/scalanlp/breeze/issues/323#issuecomment-62029341.

dlwh commented 9 years ago

If anyone wants to implement this: https://github.com/PatWie/CppNumericalSolvers/blob/master/src/LbfgsbSolver.cpp

dreamquster commented 9 years ago

I implement this method.But I have some question?

  1. If StrongWolfeLineSearch(maxZoomIter,maxLineSearchIter) is small, the wolfeRuleSearch.minimize may throw FirstOrderException.That's why? 2.In FirstOrderMinimizer.State.convergedReason if (!fVals.isEmpty && (adjustedValue - fVals.max).abs <= tolerance * initialAdjVal) Some(FirstOrderMinimizer.FunctionValuesConverged) If init x made initialAdjVal is bigger enough, the FirstOrderMinimizer may exit quickly without reaching minimal value?
  2. What's wrong with this code style? when x and dirk is DenseVector[Double] x = x + dirk:_2 is not equal to x += dirk:_2
dlwh commented 9 years ago

First, thanks so much for taking this on!

I'm not sure I totally understand all your questions.

On Wed, Mar 25, 2015 at 12:38 AM, dreamquster notifications@github.com wrote:

I implement this method.But I have some question?

  1. If StrongWolfeLineSearch(maxZoomIter,maxLineSearchIter) is small, the wolfeRuleSearch.minimize may throw FirstOrderException.That's why? 2.In FirstOrderMinimizer.State.convergedReason if (!fVals.isEmpty && (adjustedValue - fVals.max).abs <= tolerance * initialAdjVal) Some(FirstOrderMinimizer.FunctionValuesConverged) If init x made initialAdjVal is bigger enough, the FirstOrderMinimizer may exit quickly without reaching minimal value?

Yeah, we're working on changing the convergence criteria.

  1. What's wrong with this code style? when x and dirk is DenseVector[Double] x = x + dirk:_2 is not equal to x += dirk:_2

Those ought to be the same? Can you give an example?

— Reply to this email directly or view it on GitHub https://github.com/scalanlp/breeze/issues/323#issuecomment-85899796.

dlwh commented 9 years ago

Hit send too soon:

On Wed, Mar 25, 2015 at 12:38 AM, dreamquster notifications@github.com wrote:

I implement this method.But I have some question?

  1. If StrongWolfeLineSearch(maxZoomIter,maxLineSearchIter) is small, the wolfeRuleSearch.minimize may throw FirstOrderException.That's why?

I don't understand what you mean here? If it can't find stepsize that satisfies the wolfe conditions, it will throw.

On Wed, Mar 25, 2015 at 10:39 AM, David Hall david.lw.hall@gmail.com wrote:

First, thanks so much for taking this on!

I'm not sure I totally understand all your questions.

On Wed, Mar 25, 2015 at 12:38 AM, dreamquster notifications@github.com wrote:

I implement this method.But I have some question?

  1. If StrongWolfeLineSearch(maxZoomIter,maxLineSearchIter) is small, the wolfeRuleSearch.minimize may throw FirstOrderException.That's why? 2.In FirstOrderMinimizer.State.convergedReason if (!fVals.isEmpty && (adjustedValue - fVals.max).abs <= tolerance * initialAdjVal) Some(FirstOrderMinimizer.FunctionValuesConverged) If init x made initialAdjVal is bigger enough, the FirstOrderMinimizer may exit quickly without reaching minimal value?

Yeah, we're working on changing the convergence criteria.

  1. What's wrong with this code style? when x and dirk is DenseVector[Double] x = x + dirk:_2 is not equal to x += dirk:_2

Those ought to be the same? Can you give an example?

— Reply to this email directly or view it on GitHub https://github.com/scalanlp/breeze/issues/323#issuecomment-85899796.

dreamquster commented 9 years ago

The following code failed.

object VectorTest {
  def main(args:Array[String]) = {
    var x = DenseVector[Double](2.0, 3.0)
    var xcopy = x.copy
    val dir = DenseVector[Double](1.0, 0.0)
    x = x + dir:*0.5
    xcopy += dir:*0.5
    assert(x == xcopy)
  }
}

Because operator :* priority is less than +. Compiler translate x + dir :_0.5 into (x+dir):_0.5

dlwh commented 9 years ago

ah right, you're getting bitten by operator precedence. operators starting with : have low priority in Scala. Not much I can do about it, besides pick a different operator. You can just use * for multiplication by scalars.

On Wed, Mar 25, 2015 at 7:35 PM, dreamquster notifications@github.com wrote:

The following code failed.

object VectorTest { def main(args:Array[String]) = { var x = DenseVector[Double](2.0, 3.0) var xcopy = x.copy val dir = DenseVector[Double](1.0, 0.0) x = x + dir:_0.5 xcopy += dir:_0.5 assert(x == xcopy) } }

— Reply to this email directly or view it on GitHub https://github.com/scalanlp/breeze/issues/323#issuecomment-86303521.

dreamquster commented 9 years ago

Ok, thanks.

dreamquster commented 9 years ago

How to create a new pull request? Why this changed is merged into "numpy parity: financial functions #128" pull?

dlwh commented 9 years ago

pull requests are by branch, so every time you push to a branch, the PR gets updated.

On Mon, Mar 30, 2015 at 8:04 AM, dreamquster notifications@github.com wrote:

How to create a new pull request? Why this changed is merged into "numpy parity: financial functions #128 https://github.com/scalanlp/breeze/issues/128" pull?

— Reply to this email directly or view it on GitHub https://github.com/scalanlp/breeze/issues/323#issuecomment-87715683.

dlwh commented 9 years ago

(Thanks @dreamquster ! Cleaned up the implementation to be a bit more pure and added a test. Seems to work great!)