argmin-rs / argmin

Numerical optimization in pure Rust
http://argmin-rs.org
Apache License 2.0

LBFGS with constraints on x #357

Open fedemagnani opened 1 year ago

fedemagnani commented 1 year ago

First of all I want to say thanks to all the contributors of this crate, you've done an amazing job.

I'm getting started with argmin by running some basic examples. I wanted to solve a simple constrained optimization problem: finding the minimum of the half-line y = x with x >= 0, which starts at the origin, using LBFGS as the solver. Of course the solution to this problem is x = 0 and the minimum is 0. I've implemented the CostFunction and Gradient traits as follows:

    #[derive(Clone)]
    struct Problem{
        x: Arc<Mutex<Floating>>
    }
    impl Problem {
        fn new() -> Self {
            Self {
                x: Arc::new(Mutex::new(5.0))
            }
        }
        fn f(&self, x: Floating) -> Floating {
            let mut changeable_x = self.x.lock().unwrap();
            *changeable_x = x;
            println!("x:{:?}", x);
            // Outside the feasible region (x < 0) return a large finite penalty value.
            if x < self.x_lower_bound() {
                return self.sup();
            }
            x
        }
        fn grad_f(&self, x: Floating) -> Vec<Floating> {
            // vec![2.0 * x]
            // vec![2.0*(x-4.0)]
            // Outside the feasible region the gradient is set to zero.
            if x <= self.x_lower_bound() {
                return vec![0.0];
            }
            vec![1.0]
        }
        fn x_lower_bound(&self) -> Floating {
            0.0
        }

        // Large finite value used in place of +infinity as the out-of-bounds cost.
        fn sup(&self) -> Floating {
            100.0
        }
    }

    impl CostFunction for Problem {
        type Param = Vec<Floating>;
        type Output = Floating;
        fn cost(&self, x: &Self::Param) -> Result<Self::Output, Error> {
            let output = self.f(x[0]);
            println!("image: {:?}",output);
            Ok(output)
        }
    }

    impl Gradient for Problem {
        type Param = Vec<Floating>;
        type Gradient = Vec<Floating>;

        fn gradient(&self, x: &Self::Param) -> Result<Self::Gradient, Error> {
            let output = self.grad_f(x[0]);
            println!("gradient: {:?}",output);
            Ok(output)
        }
    }

Here Floating is a type alias for f64. I've tried to incorporate the constraint x >= 0 by returning 100.0 whenever x falls below the lower bound (I tried returning f64::INFINITY, but the code panicked with the error MoreThuenteLineSearch: NaN or Inf encountered during iteration).
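For completeness, the snippets in this report assume roughly the following imports and alias (a sketch only; the exact module paths depend on the argmin version in use, and the SlogLogger import in particular has moved between releases):

    use std::sync::{Arc, Mutex};

    use argmin::core::observers::ObserverMode;
    use argmin::core::{CostFunction, Error, Executor, Gradient};
    use argmin::solver::linesearch::MoreThuenteLineSearch;
    use argmin::solver::quasinewton::LBFGS;
    // SlogLogger: argmin::core::observers::SlogLogger (feature "slog-logger") in
    // older releases, or the separate argmin-observer-slog crate in newer ones.

    // Floating is just an alias for f64.
    type Floating = f64;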

This is the code for solving the problem:

    #[test]
    fn test_min() {
        let linesearch: MoreThuenteLineSearch<Vec<f64>, Vec<f64>, f64> = MoreThuenteLineSearch::new();
        let initial_guess: Vec<f64> = vec![10.0; 1];
        let lbfgs: LBFGS<MoreThuenteLineSearch<Vec<f64>, Vec<f64>, f64>, Vec<f64>, Vec<f64>, f64>= LBFGS::new(linesearch, 10); 

        let cost = Problem::new();

        let res = Executor::new(cost.clone(), lbfgs)
            .configure(|state| state.param(initial_guess).max_iters(100))
            .add_observer(SlogLogger::term(), ObserverMode::Always)
            .run()
            .unwrap();

        println!("Solution: {:?}", res.state);
    }

If I set the number of iterations to 1, it solves the problem as follows:

running 1 test
x:10.0
image: 10.0
gradient: [1.0]
x:9.0
image: 9.0
May 14 17:55:34.046 INFO L-BFGS, max_iters: 1
gradient: [1.0]
x:5.0
image: 5.0
gradient: [1.0]
x:-11.0
image: 100.0
gradient: [0.0]
x:-5.5600000000000005
image: 100.0
gradient: [0.0]
x:4.814681808522201
image: 4.814681808522201
gradient: [1.0]
x:-2.0326081851024504
image: 100.0
gradient: [0.0]
x:0.29547041272993013
image: 0.29547041272993013
gradient: [1.0]
x:0.29547041272993013
image: 0.29547041272993013
gradient: [1.0]
gradient: [1.0]
May 14 17:55:34.057 INFO iter: 0, cost: 0.29547041272993013, best_cost: 0.29547041272993013, gradient_count: 10, cost_count: 9, time: 0.0106955, gamma: 1
Solution: IterState { param: Some([0.29547041272993013]), prev_param: None, best_param: Some([0.29547041272993013]), prev_best_param: Some([10.0]), cost: 0.29547041272993013, prev_cost: 10.0, best_cost: 0.29547041272993013, prev_best_cost: 10.0, target_cost: -inf, grad: Some([1.0]), prev_grad: None, hessian: None, prev_hessian: None, inv_hessian: None, prev_inv_hessian: None, jacobian: None, prev_jacobian: None, iter: 1, last_best_iter: 0, max_iters: 1, counts: {"cost_count": 9, "gradient_count": 10}, time: Some(12.3175ms), termination_status: Terminated(MaxItersReached) }

And of course the solution is close to zero, if not exactly zero. However, with the number of iterations set to more than 1, the code loops endlessly, remaining stuck at x = NaN, objective = NaN and gradient = 1:

x:NaN
image: NaN
gradient: [1.0]

Any thoughts?

stefan-k commented 1 year ago

Apologies for the very very late reply! I have been very busy over the last couple of months and my access to the internet is currently limited (#360).

Thanks for this detailed and helpful report. Unfortunately I haven't been able to test this myself yet. As a first guess, I think the problem is that you set the gradient to 0 whenever the solver is outside the feasible range. Could you try changing your setup so that the gradient increases the further the iterate is from the feasible range?
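
For example, a quadratic penalty keeps the cost finite while giving the line search a gradient that grows with the constraint violation. This is a minimal sketch of that suggestion only, replacing the bodies of f and grad_f from the report above; the weight constant is an arbitrary illustrative choice, not something prescribed by argmin:

    // Arbitrary illustrative penalty weight (not an argmin concept).
    const PENALTY_WEIGHT: Floating = 1_000.0;

    impl Problem {
        // Cost: x itself inside the feasible region; outside of it, the value at
        // the bound plus a smooth quadratic penalty that grows with the violation.
        fn f(&self, x: Floating) -> Floating {
            if x < self.x_lower_bound() {
                let violation = self.x_lower_bound() - x;
                self.x_lower_bound() + PENALTY_WEIGHT * violation * violation
            } else {
                x
            }
        }

        // Gradient: 1 inside the feasible region; outside of it, its magnitude
        // grows linearly with the violation, so a descent step pushes the
        // iterate back towards the bound instead of stalling on a flat region.
        fn grad_f(&self, x: Floating) -> Vec<Floating> {
            if x < self.x_lower_bound() {
                let violation = self.x_lower_bound() - x;
                vec![-2.0 * PENALTY_WEIGHT * violation]
            } else {
                vec![1.0]
            }
        }
    }

Whether this actually resolves the NaN loop would still need to be verified against a real run of the solver.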