argmin-rs / argmin

Numerical optimization in pure Rust
http://argmin-rs.org
Apache License 2.0
973 stars 76 forks source link

BFGS with nalgebra Vector2 #337

Open amfaber opened 1 year ago

amfaber commented 1 year ago

Hi, amazing crate! As someone looking to get further into numerical optimization, just the fact that all of these algorithms are gathered in one place, with trait bounds explaining what each algorithm needs is just amazing, so thanks for that :)

I've been trying to get the BFGS example to work on my system and have rewritten it in an attempt to have it work on nalgebra's Vector2 (importantly not DVector, as I would like the performance benefits of stack allocating my params). This is my current code

use argmin::{
    core::{CostFunction, Executor, Error, Gradient},
    solver::{
        neldermead::NelderMead,
        quasinewton::BFGS,
        linesearch::MoreThuenteLineSearch,
    },
};
use argmin_testfunctions::{rosenbrock_2d, rosenbrock_2d_derivative};
use nalgebra::{Vector2, Matrix2};

struct Rosenbrock{
    a: f64,
    b: f64,
}

impl CostFunction for Rosenbrock{
    type Param = Vector2<f64>;

    type Output = f64;

    fn cost(&self, param: &Self::Param) -> Result<Self::Output, Error> {
        Ok(rosenbrock_2d(param.as_slice(), self.a, self.b))
    }
}

impl Gradient for Rosenbrock{
    type Param = Vector2<f64>;

    type Gradient = Vector2<f64>;

    fn gradient(&self, param: &Self::Param) -> Result<Self::Gradient, Error> {
        Ok(Vector2::from_iterator(rosenbrock_2d_derivative(param.as_slice(), self.a, self.b).into_iter()))
    }
}

fn bfgs(){
    let cost = Rosenbrock{ a: 1.0, b: 100.0 };
    let line = MoreThuenteLineSearch::new().with_c(1e-4, 0.9).unwrap();
    let solver = BFGS::new(line);
    let init_param = Vector2::new(1.2, 1.0);
    let init_hess = Matrix2::<f32>::identity();
    let res = Executor::new(cost, solver)
        .configure(|state| 
            state
                .param(init_param)
                .inv_hess(init_hess)
                .max_iters(60))
        .run()
        .unwrap();
}

However, I'm getting the following compiler error

error[E0277]: the trait bound `f64: argmin_math::ArgminEye` is not satisfied
  --> gauss_fitting\src\main.rs:58:35
   |
58 |     let res = Executor::new(cost, solver)
   |               -------------       ^^^^^^ the trait `argmin_math::ArgminEye` is not implemented for `f64`
   |               |
   |               required by a bound introduced by this call
   |
   = help: the following other types implement trait `argmin_math::ArgminEye`:
             Matrix<N, R, C, <DefaultAllocator as nalgebra::allocator::Allocator<N, R, C>>::Buffer>
             Vec<Vec<f32>>
             Vec<Vec<f64>>
             Vec<Vec<i16>>
             Vec<Vec<i32>>
             Vec<Vec<i64>>
             Vec<Vec<i8>>
             Vec<Vec<isize>>
           and 17 others
   = note: required for `BFGS<MoreThuenteLineSearch<Matrix<f64, Const<2>, Const<1>, ArrayStorage<f64, 2, 1>>, Matrix<f64, Const<2>, Const<1>, ArrayStorage<f64, 2, 1>>, f64>, f64>` to implement `Solver<Rosenbrock, IterState<Matrix<f64, Const<2>, Const<1>, ArrayStorage<f64, 2, 1>>, Matrix<f64, Const<2>, Const<1>, ArrayStorage<f64, 2, 1>>, (), f64, f64>>`
note: required by a bound in `Executor::<O, S, I>::new`
  --> C:\Users\andre\.cargo\registry\src\github.com-1ecc6299db9ec823\argmin-0.8.1\src\core\executor.rs:38:8
   |
38 |     S: Solver<O, I>,
   |        ^^^^^^^^^^^^ required by this bound in `Executor::<O, S, I>::new`

error[E0599]: the method `configure` exists for struct `Executor<Rosenbrock, BFGS<MoreThuenteLineSearch<Matrix<f64, Const<2>, Const<1>, ArrayStorage<f64, 2, 1>>, Matrix<f64, Const<2>, Const<1>, ArrayStorage<f64, 2, 1>>, f64>, f64>, IterState<Matrix<f64, Const<2>, Const<1>, ArrayStorage<f64, 2, 1>>, Matrix<f64, Const<2>, Const<1>, ArrayStorage<f64, 2, 1>>, (), f64, f64>>`, but its trait bounds were not satisfied
  --> gauss_fitting\src\main.rs:59:10
   |
59 |         .configure(|state|
   |          ^^^^^^^^^ method cannot be called due to unsatisfied trait bounds
   |
  ::: C:\Users\andre\.cargo\registry\src\github.com-1ecc6299db9ec823\argmin-0.8.1\src\solver\quasinewton\bfgs.rs:51:1
   |
51 | pub struct BFGS<L, F> {
   | --------------------- doesn't satisfy `_: Solver<Rosenbrock, IterState<Matrix<f64, Const<2>, Const<1>, ArrayStorage<f64, 2, 1>>, Matrix<f64, Const<2>, Const<1>, ArrayStorage<f64, 2, 1>>, (), f64, f64>>`
   |
   = note: the following trait bounds were not satisfied:
           `BFGS<MoreThuenteLineSearch<Matrix<f64, Const<2>, Const<1>, ArrayStorage<f64, 2, 1>>, Matrix<f64, Const<2>, Const<1>, ArrayStorage<f64, 2, 1>>, f64>, f64>: Solver<Rosenbrock, IterState<Matrix<f64, Const<2>, Const<1>, ArrayStorage<f64, 2, 1>>, Matrix<f64, Const<2>, Const<1>, ArrayStorage<f64, 2, 1>>, (), f64, f64>>`

It's unclear to me why ArgminEye should be required for f64 in this case? What can I do to get BFGS to work with the statically sized types of ndalgebra for that sweet, sweet stack allocation?

Once again, thanks for the amazing work and the time to help me with my issue :)

stefan-k commented 1 year ago

Hi,

apologies for the late reply! Thanks for the kind words, I am always happy to hear when people find argmin useful :)

Admittedly, I'm not an expert on nalgebra. Most work on the nalgebra backend was done by collaborators, but from a first glance most traits were implemented for the seemingly most general Matrix type, therefore SVector and SMatrix should be covered. If not, I consider this a bug. Some are implemented for OMatrix and I'm unsure if that covers the statically allocated arrays as well.

The error message you get does seem a bit strange though. I'm not sure if this is related, but there are some inconsistencies regarding the use of f32 and f64 which may cause problems, in particular:

    type Param = Vector2<f64>;
    type Output = f64;

and

    let init_hess = Matrix2::<f32>::identity();

It would be interesting to see if changing the latter to Matrix2::<f64>::identity() has an effect on the error message.

EDIT: Also, this should be inv_hessian(init_hess):

                .inv_hess(init_hess)
amfaber commented 1 year ago

Thanks for that, hadn't spotted those mistakes from messing around trying some different things. Unfortunately, the issue arises before the hessian is even checked for trait bounds. I can comment out the lines defining init_hess and .inv_hessian(...) and the compiler error still persists. Actually the error still shows if I just shorten the last line to

let res = Executor::new(cost, solver);
stefan-k commented 1 year ago

I did some detective work but just ended up being confused.

The type of the Hessian is inferred by whatever one passes into inv_hessian of IterState (inside the closure passed to configure). However, when Vector2 is used, the Hessian type H is for some reason inferred as f64, even without calling configure (as you stated before). I wasn't able to figure out why yet.

Interestingly, I don't even get it to work with DVector, but I think that's a different and probably unrelated problem.

I will have to do a more in-depth investigation of this, but I'm afraid that this may take a while because my time currently is pretty limited. I will do my best though. If you or someone else wants to have a go at this, feel free to do so.

Thanks for reporting this issue! :)

amfaber commented 1 year ago

It did feel like something was fundamentally off. Thanks for looking into it! My problem is actually just least squares fitting, so I'll probably end up going with a Gauss-Newton like method. Although I have an awful lot of spots to fit, so I might start looking into implementing one of the algorithms on a GPU with wgpu instead. Once again thanks for the very readable source files :))

itrumper commented 1 year ago

To possibly add to the confusion (sorry!) I am using na::Vector3<f64> as my parameter vector and everything is happy.

Typically, when I've had trait bound issues with nalgebra, its due to conflicting versions. You need to be sure to match the argmin-math dependency version with your crate's dependency version. What version of nalgebra do you have in your Cargo.toml?

For example, I have:

[dependencies]
nalgebra = { version = "0.30" }
argmin = { version = "0.8" }
argmin-math = { version = "0.3", features = ["nalgebra_v0_30-serde"] }

Hope this help!

stefan-k commented 7 months ago

I am running into the same problem with Vector3. I think I have boiled it down to this constraint in the Solver impl of BFGS:

    P: /* ... */ + ArgminDot<G, H> + ArgminDot<P, H>,

In the nalgebra case, ArgminDot is implemented for the case type(<vec1, vec2>) = scalar, here:

https://github.com/argmin-rs/argmin/blob/e9bebb21d99d2ccad1c36d7373e7f4f53eec1539/crates/argmin-math/src/nalgebra_m/dot.rs#L41C1-L53C2

and type(<vec1, vec2>) = matrix{dim1 = dim1(vec1), dim2 = dim2(vec2)}:

https://github.com/argmin-rs/argmin/blob/e9bebb21d99d2ccad1c36d7373e7f4f53eec1539/crates/argmin-math/src/nalgebra_m/dot.rs#L69C1-L86C2

The latter is the one that should be used, but if both vectors are either row- or column-vectors, it resorts to the former. In the ndarray impl, it just assumes that the first is a row- and the second a column-vector when a matrix is requested as output.

I'm surprised it works for DVector though.

I'm not that familiar with nalgebra unfortunately. Could anyone with nalgebra experience help?

troiganto commented 6 months ago

In the nalgebra case, ArgminDot is implemented for the case type(<vec1, vec2>) = scalar, here:

https://github.com/argmin-rs/argmin/blob/e9bebb21d99d2ccad1c36d7373e7f4f53eec1539/crates/argmin-math/src/nalgebra_m/dot.rs#L41C1-L53C2

is this right? The signature is:

impl<...> ArgminDot<N, OMatrix<...>> for Matrix<...>

so doesn't this implement type(vec * scalar) = vec{dim = dim(vec)}, i.e. multiplication with a scalar?

stefan-k commented 6 months ago

Oh yes that's correct, I must have meant this part: https://github.com/argmin-rs/argmin/blob/e9bebb21d99d2ccad1c36d7373e7f4f53eec1539/crates/argmin-math/src/nalgebra_m/dot.rs#L23-L39