antonior92 / ip-nonlinear-solver

A trust-region interior-point method for general nonlinear programming problems (GSoC 2017).
BSD 3-Clause "New" or "Revised" License
36 stars 12 forks

Project decisions #5

Closed antonior92 closed 7 years ago

antonior92 commented 7 years ago

I think it is a good time for us to start discussing the solver interface. My general impression is that trying to fit the solver we are implementing into the minimize interface may not be practical. What do you guys think about creating a separate solver?

I see the minimize interface as very appropriate for the needs of unconstrained problems, but the way to access the two constrained solvers (SLSQP and COBYLA) seems a little clumsy to me. And I think it would be much worse for our solver.

If we go for a different interface we need to think how the parameters are going to be passed by the user.


I see two possible ways to specify constraints:

One option is to formulate the problem as: (Option number 1)

min f(x)
subject to: h(x) = 0
            g(x) >= 0
            lb <= x <= ub

for callables f, h and g.

Another option is to formulate the problem as: (Option number 2)

min f(x)
subject to: h(x) = 0
         clb  <= c(x) <= cub
         lb <= x <= ub

I have seen this interface in several commercial solvers and I like it a lot. While it is not usually how optimisation problems are described in textbooks, it is very intuitive, and it allows us to formulate constrained problems in a very easy way.
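To make the two formulations concrete, here is a small hypothetical example. The names f, h, g, c, clb and cub are just the symbols from the formulations above, not a real API; the problem itself (minimize x0² + x1² subject to x0 + x1 = 1 and x0 - x1 >= -2) is made up for illustration:

```python
import numpy as np

# Toy problem: minimize x0**2 + x1**2
# subject to  x0 + x1 = 1  and  x0 - x1 >= -2.

def f(x):
    return x[0]**2 + x[1]**2

# Option number 1: separate equality h(x) = 0 and inequality g(x) >= 0.
def h(x):
    return np.array([x[0] + x[1] - 1.0])

def g(x):
    return np.array([x[0] - x[1] + 2.0])

# Option number 2: a single vector c(x) with two-sided bounds; the
# equality is expressed by making the lower and upper bound coincide.
def c(x):
    return np.array([x[0] + x[1], x[0] - x[1]])

clb = np.array([1.0, -2.0])
cub = np.array([1.0, np.inf])

# Both formulations accept the same feasible point:
x = np.array([0.5, 0.5])
print(np.allclose(h(x), 0.0) and np.all(g(x) >= 0))   # True
print(bool(np.all((clb <= c(x)) & (c(x) <= cub))))    # True
```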

nmayorov commented 7 years ago

I think it would be much cleaner to define constraints as vectorized functions, so I do support introducing a new interface.

I also see 2 sub-approaches:

  1. Introduce a general function minimize_constrained (or something along those lines) which will serve as a general interface for future possible implementations.
  2. Or just introduce a specific function minimize_ip (or something like that) which will implement your solver as conveniently as possible.

I think there are merits to both approaches; probably 1 is better if done appropriately (but it needs more consideration). I think you should ask on the mailing list (probably specifically mentioning Pauli Virtanen) to check that it won't be strongly objected to by others.


As for the constraints specification: can you explain why you prefer the second one? For me it also looks more convenient, especially as we can set clb, cub, lb, ub to +-np.inf by default, so a user needs to set only some of them. But the first approach is certainly not bad either; it is clean and canonical.

rgommers commented 7 years ago

Good idea to start discussing this now.

I see the minimize interface as very appropriate for the needs of unconstrained problems, but the way to access the two constrained solvers (SLSQP and COBYLA) seems a little clumsy to me. And I think it would be much worse for our solver.

It would probably be helpful for the discussion to illustrate this better, e.g. by using a concrete example with a function and some constraints and then showing what the minimize function call and your two alternatives look like in code.

antonior92 commented 7 years ago

Thanks to both of you for the feedback.

About the new interface: I like Nikolay's two possible approaches (minimize_constrained vs minimize_ip). The first one seems the better option but, as you said, it will require more consideration. I will send an email to the mailing list as you suggested.


About the way constraint specifications are defined: I prefer option number 2 because it seems intuitive and it can be used to do anything the first one is able to do (just set clb=0 and cub=inf). The reverse is also true, but it usually requires more modifications from the user. And lastly, this interface is used in production solvers (e.g. IPOPT), in test problems (from CUTEst) and in optimisation libraries (e.g. JuMP), so there are precedents for using it.

The alternative (option number 1) is also a pretty good one. Matlab's optimisation library uses it and I don't think anyone will actually have any trouble with it.

I will write the function signatures as Ralf suggested.

mdhaber commented 7 years ago

Antonio, I had a bit of trouble following this last post. You refer to "first one", "second option", and "alternative", but it's not always clear whether you mean first/second of your problem formulations or first/second of Nikolay's sub-approaches.

I think I got most of it, but I'm not sure I followed the bit about the "alternative". Here is Matlab's fmincon documentation. I don't see how it's the same as any "alternative" mentioned here. For example, it accepts linear and nonlinear constraints separately.

I agree that the interface of minimize for constrained problems is clumsy. But I don't see how it will be any clumsier for the interior-point code than it is for SLSQP, as that solver can handle the fully general problem, too.

For what it's worth:

But of course, as Ralf suggested, a discussion on the mailing list is appropriate; this is just my two cents.

Please note that in addition to writing out possible function signatures, I believe Ralf was asking for specific examples of calling those functions, complete with definitions for any objective and constraint functions.

mdhaber commented 7 years ago

BTW, Antonio, I absolutely agree that the problem is most naturally expressed as:

min f(x)
subject to: h(x) = 0
         clb  <= c(x) <= cub
         lb <= x <= ub

(possibly with the addition of separate linear constraints, which I would recommend for efficiency) and that a corresponding signature would be more convenient for users. While it may require specification of more parameters, I think users would find that easier than rearranging equations.

And maybe convenience is more important than familiarity. I suppose it's OK for your solver to be better than Matlab's : )

P.S. I think I've figured it all out now : ) By "alternative", you meant the first problem formulation you (Antonio) suggested, which is indeed what Matlab does if one ignores the direction of the inequality constraint and the separation of linear constraints.

antonior92 commented 7 years ago

Sorry about the confusing post, Matt :) I edited the post to try to clarify things.

Using separate linear constraints seems like a good idea. The greatest advantage of doing so is that we could apply some preprocessing to those constraints; maybe we could even reuse the routines you implemented for your linear interior-point method (e.g. the routines from _remove_redundancy.py).

nmayorov commented 7 years ago

I suggest changing the interface for all solvers, or sticking with the current interface. I don't see the value of having a separate minimize_constrained interface, since it will likely be the same as the interface for the unconstrained solvers just with additional arguments. (It certainly shouldn't change, say, the order in which the objective function and starting point are specified....) I really don't think it is a good idea to have a separate interface just for your IP method because I don't like the idea of having to reformulate the problem in order to try the SLSQP solver. We agreed that different solvers are good for different problems, so we should make it easy for the user to try different solvers.

These are all valid concerns, but there seems to be no ideal solution. I will try to break down the problems I see.

  1. If we were to fit Antonio's solver into the current minimize: the constraint specification in it is very inconvenient (in my opinion) and not really in the spirit of the scipy/numpy vectorized approach. I think it would be a real pain for Antonio to make his solver fit into this scheme. Additionally, it is not even clear how to connect the approach in minimize with the vectorized approach Antonio will be using (iterating through dicts and populating an array? --- doesn't look nice at all).
  2. If we were to change how constraints are defined in minimize: then it is not backward compatible, and some of the SLSQP code needs to be changed (not totally trivial, I suppose).

I agree that it looks like minimize_constrained doesn't make much sense, because unconstrained problems are just a particular case. Whether it all leads us to writing minimize_new (the name is bad) --- I don't know.

Is minimize_ip such a bad idea after all? One obvious advantage I see is that it will be easier to design and cleaner in terms of API (i.e. it will have only the necessary parameters, with descriptions right in the function). The disadvantage is what Matt said --- it is separated from the other solvers and requires problem reformulation to run SLSQP (but considering what I said about the SLSQP implementation, can that really be avoided?).

nmayorov commented 7 years ago

It turned out I was wrong about how SLSQP constraints must be specified --- they can be specified in vectorized fashion, so you don't need more than 2 elements (one for equality and one for inequality constraints). So it is quite compatible with the vectorized approach. Probably this should be explained in the docstring.
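For reference, the existing scipy.optimize.minimize interface does accept vector-valued constraint functions with SLSQP; a single 'eq' entry and a single 'ineq' entry can each carry any number of constraints:

```python
import numpy as np
from scipy.optimize import minimize

# Minimize x0**2 + x1**2 subject to x0 + x1 >= 1 and x0 - x1 = 0.
# Each constraint dict may return a vector, so one 'eq' and one
# 'ineq' entry suffice regardless of how many constraints there are.
res = minimize(
    lambda x: x[0]**2 + x[1]**2,
    x0=[2.0, 0.0],
    method='SLSQP',
    constraints=[
        {'type': 'ineq', 'fun': lambda x: np.array([x[0] + x[1] - 1.0])},
        {'type': 'eq', 'fun': lambda x: np.array([x[0] - x[1]])},
    ],
)
print(res.x)  # solution near [0.5, 0.5]
```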

Considering this, I have another idea: introduce a Constraints class which will describe all non-linear constraints. Then the constraints argument can accept a Constraints instance or the old dict, the necessary adaptations are made inside, and we can run both old and new solvers.

mdhaber commented 7 years ago

I didn't realize there was a concern about not being able to vectorize constraints before. That would certainly explain the objection to the current interface! The current dictionary is still clumsy, but specifying constraints one at a time would be truly horrible : )

I really like the thought of creating a Constraints class! Then, say we have

h(x) = 0
c_lb  <= c(x) <= c_ub
A_eq x = b_eq
b_lb <= A_ineq x <= b_ub
lb <= x <= ub

Of course there would be some order in which they'd appear in the Constraints __init__ signature, but the more common use case would probably be naming all the parameters, for example: all_constraints = Constraints(h=h, c=c, c_lb=c_lb, A_eq=A_eq, b_eq=b_eq, lb=lb, ub=ub), which is effectively a problem of Antonio's form (Option Number 1 - thanks, Antonio!) with the addition of a linear equality constraint. This is just an example to demonstrate that we don't really have to choose between Option Number 1 and Option Number 2 (as Nikolay may have been alluding to in his first post), since Option Number 2 allows itself to be used as Option Number 1. Ah, and I think having this class is better than a dict because you don't have to type all those damn quotes.
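A minimal sketch of what such a class could look like, with all names hypothetical (nothing like this exists in scipy at this point) and only a feasibility check implemented to show how the pieces fit together:

```python
import numpy as np

class Constraints:
    """Hypothetical container for   h(x) = 0,
    c_lb <= c(x) <= c_ub,  A_eq x = b_eq,
    b_lb <= A_ineq x <= b_ub,  lb <= x <= ub.
    Unspecified pieces default to None / infinite bounds."""

    def __init__(self, h=None, c=None, c_lb=-np.inf, c_ub=np.inf,
                 A_eq=None, b_eq=None, A_ineq=None,
                 b_lb=-np.inf, b_ub=np.inf, lb=-np.inf, ub=np.inf):
        self.h, self.c, self.c_lb, self.c_ub = h, c, c_lb, c_ub
        self.A_eq, self.b_eq = A_eq, b_eq
        self.A_ineq, self.b_lb, self.b_ub = A_ineq, b_lb, b_ub
        self.lb, self.ub = lb, ub

    def is_feasible(self, x, tol=1e-8):
        """Check whether x satisfies every specified constraint."""
        x = np.asarray(x, dtype=float)
        ok = np.all((self.lb - tol <= x) & (x <= self.ub + tol))
        if self.h is not None:
            ok &= np.all(np.abs(self.h(x)) <= tol)
        if self.c is not None:
            cx = self.c(x)
            ok &= np.all((self.c_lb - tol <= cx) & (cx <= self.c_ub + tol))
        if self.A_eq is not None:
            ok &= np.all(np.abs(self.A_eq @ x - self.b_eq) <= tol)
        if self.A_ineq is not None:
            ax = self.A_ineq @ x
            ok &= np.all((self.b_lb - tol <= ax) & (ax <= self.b_ub + tol))
        return bool(ok)

cons = Constraints(h=lambda x: x[0] + x[1] - 1.0, lb=0.0)
print(cons.is_feasible([0.5, 0.5]))  # True
```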

And regarding separating the linear constraints - I wasn't even thinking about reusing the preprocessing routine, although that's a good idea*. I was just thinking that it saves the trouble of linearizing those constraints (or am I showing my ignorance of the algorithm?).

* I'd need to fix it up a bit first:

antonior92 commented 7 years ago

I like the idea of introducing a Constraint class. I think it would give the flexibility I need while still using the same minimize interface.

About the linear constraints: I don't think the necessity of "linearizing" linear constraints causes any significant overhead in the implementation. I think the main advantage would be the capability of preprocessing these linear constraints.

I know that presolving schemes are usually complicated and that implementing a state-of-the-art presolver is a lot of trouble. But I think some preprocessing is better than nothing. I will think more about this over the next weeks.

nmayorov commented 7 years ago

The current dictionary is still clumsy, but specifying constraints one at a time would be truly horrible : )

That's what I was afraid of initially.

About the linear constraints: I don't think the necessity of "linearizing" linear constraints cause any significant overhead in the implementation. I think the main advantage would be the capability of preprocessing these linear constraints.

I think without preprocessing it comes down to API convenience. My guess is that it's better to start without special linear constraints, just to keep everything slightly simpler, and decide later.

antonior92 commented 7 years ago

I think without preprocessing it comes down to API convenience. My guess that it's better to start without special linear constraints just to keep everything slightly simpler and decide later.

Sounds good to me.

antonior92 commented 7 years ago

Hey guys, working on the solver interface I just came to the conclusion that the constraints interface:

h(x) = 0
c_lb  <= c(x) <= c_ub
A_eq x = b_eq
b_lb <= A_ineq x <= b_ub
lb <= x <= ub

while convenient to use, can also be a major performance killer. That is, it is hard to parse the constraints in such a way that I don't end up doing some wasteful slicing of sparse matrices (in some situations).

Anyway, I am deciding to go for the Matlab interface:

minimize fun(x)
subject to: constr(x) <= 0
            constr_eq(x) = 0
            A x <= b
            A_eq x = b_eq
            lb <= x <= ub

While not as easy to use as the previous one, I think at least the user would not experience differences in performance depending on the way the constraints are ordered and how the upper and lower bounds are set.

What do you three think about this?

mdhaber commented 7 years ago

Can't you do all the slicing once at the beginning to rearrange the problem into the form needed? Surely that's not so bad compared to the overall execution time.


mdhaber commented 7 years ago

Oops never mind. Nonlinear gotcha.


antonior92 commented 7 years ago

The problem is that I would have to keep slicing the matrices every time the constraint function is called (as you just realised πŸ˜ƒ), and slicing sparse matrices has an associated computational cost.

I am afraid of ending up with something whose performance is affected by reasons that look arbitrary from the user's viewpoint.
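To make the cost concrete, here is an illustrative sketch (the names are made up, and plain numpy arrays stand in for the sparse matrices): the index masks classifying the bounds can be computed once up front, but slicing the constraint values still happens on every evaluation, which is where the per-call overhead comes from:

```python
import numpy as np

def classify_bounds(c_lb, c_ub):
    """One-time preprocessing: index masks for the bound structure."""
    eq = c_lb == c_ub                    # equality rows (c_lb == c_ub)
    lower = np.isfinite(c_lb) & ~eq      # rows with a finite lower bound
    upper = np.isfinite(c_ub) & ~eq      # rows with a finite upper bound
    return eq, lower, upper

c_lb = np.array([1.0, 0.0, -np.inf])
c_ub = np.array([1.0, np.inf, 5.0])
eq, lower, upper = classify_bounds(c_lb, c_ub)

def as_ineq(cx):
    """Per-call work: g(x) >= 0 residuals require slicing cx on
    every evaluation, even though the masks are precomputed."""
    return np.concatenate([cx[lower] - c_lb[lower],
                           c_ub[upper] - cx[upper]])

cx = np.array([1.0, 2.0, 3.0])  # some constraint values c(x)
print(eq)            # [ True False False]
print(as_ineq(cx))   # [2. 2.]
```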

mdhaber commented 7 years ago

It's up to you, but I'd say up to a 25% slowdown would be a fair price for convenience. Actually, think of the alternative: rather than rewriting the relevant code to adhere to the constraint format more efficiently, the user might just write a wrapper around their original functions to do the same thing your code would be doing. So if they don't understand the problem, it's likely going to bite them anyway, right? Just playing devil's advocate. It's really your call. ;-)

antonior92 commented 7 years ago

You have a fair point: the user may end up having to do the same thing himself.

I will start with the Matlab-like interface for now; it is easier to implement and it will allow me to do the tests I need. Later I will rethink this and possibly change it to the other one.

mdhaber commented 7 years ago

Sure.

Also, would it be possible for you to expose the options for Augmented System/Normal Equations/QR at the top level? I am having trouble installing SuiteSparse on one of my computers. I was hoping to write some tests.

mdhaber commented 7 years ago

Never mind again. I just put the import in a try block. I guess your default doesn't use it.

antonior92 commented 7 years ago

I think this is the easiest way to proceed. The default options do not use the sparse Cholesky decomposition.

nmayorov commented 7 years ago

I think that the problem formulation should not, in principle, influence performance. If inequalities have two bounds, then the final matrix will be twice as tall, and it will be formed this way at some point. I don't clearly see how it can be done in a more efficient manner; i.e. what you called slicing might be done by a user anyway (what Matt said).
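The "twice as tall" point can be sketched with linear constraints (a minimal illustration, not the solver's actual code): b_lb <= A x <= b_ub is equivalent to stacking A x <= b_ub on top of -A x <= -b_lb.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, -1.0]])
b_lb = np.array([0.0, -2.0])
b_ub = np.array([1.0, 2.0])

# Two-sided bounds become a matrix twice as tall with one-sided bounds.
A_tall = np.vstack([A, -A])
b_tall = np.concatenate([b_ub, -b_lb])

x = np.array([0.3, 0.4])
two_sided = bool(np.all((b_lb <= A @ x) & (A @ x <= b_ub)))
one_sided = bool(np.all(A_tall @ x <= b_tall))
print(two_sided, one_sided)  # True True -- the two checks agree
```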

antonior92 commented 7 years ago

Today I have been working on interfaces for the solver. My idea is to provide two alternative interfaces:

  1. A Matlab-like interface:

    minimize fun(x)
    subject to: constr_ineq(x) <= 0
            constr_eq(x) = 0
            A_ineq x <= b_ineq
            A_eq x = b_eq
            lb <= x <= ub
  2. And an interface equivalent to the one used by CUTEst, IPOPT, JuMP,...:

    minimize fun(x)
    subject to: c_lb <= constr(x) <= c_ub
            lb <= x <= ub

The basic idea is to create a class:

class MatlabLikeInterface

and a class:

 class CUTEstLikeInterface

and include class methods to convert from one to the other. I am working on classes for the complete problem, but later we could adapt the implementation to the Constraints class Nikolay suggested earlier.

antonior92 commented 7 years ago

By the way, I forgot to explain. The interface:

minimize fun(x)
subject to: c_lb <= constr(x) <= c_ub
                 lb <= x <= ub

specifies equality constraints by setting c_lb = c_ub, and specifies linear constraints using an auxiliary vector.
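A small sketch of how this single form covers the other cases (illustrative names only, not a real API): an equality h(x) = 0 gets c_lb = c_ub = 0, and linear rows A x are simply appended to the vector of constraint values:

```python
import numpy as np

A = np.array([[1.0, 2.0]])  # one linear constraint row

def constr(x):
    nonlinear = np.array([x[0]**2 + x[1]**2,   # two-sided inequality
                          x[0] - x[1]])        # equality: equal bounds
    return np.concatenate([nonlinear, A @ x])  # linear rows appended

c_lb = np.array([0.0, 0.0, -np.inf])  # row 2: c_lb == c_ub == 0
c_ub = np.array([1.0, 0.0, 4.0])

x = np.array([0.5, 0.5])
cx = constr(x)
print(bool(np.all((c_lb <= cx) & (cx <= c_ub))))  # True
```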

antonior92 commented 7 years ago

After some time working on constraint specifications, I am more and more convinced that it is not a very good idea to try to fit the constrained solver into minimize.

My main reason is that the minimize API would become much more complicated. Someone just wanting to solve an unconstrained optimisation problem would be confronted with extensive documentation about constrained minimisation. That is, in order to specify the constraints the user would have to pass either:

( fun, grad, lagr_hess, n_vars, constr, jac,
        c_lb, c_ub, lb, ub, eq_list=None)

when using the interface:

minimize fun(x)
subject to: c_lb <= constr(x) <= c_ub
                 lb <= x <= ub

or:

(fun, grad, lagr_hess, n_vars, n_eq=0, n_ineq=0, constr_ineq=None,
jac_ineq=None, constr_eq=None, jac_eq=None, A_ineq=None, b_ineq=None,
A_eq=None, b_eq=None, lb=None, ub=None)

when using the interface:

minimize fun(x)
subject to: constr_ineq(x) <= 0
            constr_eq(x) = 0
            A_ineq x <= b_ineq
            A_eq x = b_eq
            lb <= x <= ub

And all the parameters would be explained to a group of users that, in many situations, only wants to solve an unconstrained optimisation problem.
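To illustrate, a hypothetical call using the Matlab-like signature above might look like this for a tiny problem (none of these names exist in scipy; this only shows how many pieces even a small constrained problem needs):

```python
import numpy as np

# Objective, gradient, and Lagrangian Hessian for
# minimize x0**2 + x1**2  subject to  x0 + x1 = 1,  x >= 0.
def fun(x):
    return x[0]**2 + x[1]**2

def grad(x):
    return 2.0 * x

def lagr_hess(x, v):
    # Hessian of the Lagrangian; constant here since the
    # equality constraint is linear.
    return 2.0 * np.eye(2)

def constr_eq(x):
    return np.array([x[0] + x[1] - 1.0])

def jac_eq(x):
    return np.array([[1.0, 1.0]])

# Hypothetical call (minimize_constrained does not exist yet):
# minimize_constrained(fun, grad, lagr_hess, n_vars=2, n_eq=1,
#                      constr_eq=constr_eq, jac_eq=jac_eq,
#                      lb=np.zeros(2), ub=np.full(2, np.inf))
print(grad(np.array([1.0, 2.0])))  # [2. 4.]
```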

I want to provide the option of the user passing `hess_ineq`, `hess_eq` and `hess` or, alternatively, passing only `hess_lagr`. And I want to provide several derivative options. And to properly illustrate the use of the function I think I would need at least 6 examples. (These examples could be delegated to the tutorials, but I don't like that idea because no one reads the tutorials.)

Summarising: I think the docstring of the minimize function would get much longer and much more complicated. Furthermore, trying to use the minimize interface would also make my life much harder.

I want to propose a new round of discussions about the idea of creating a minimize_constrained. Matt's idea that:

I don't see the value of having a separate minimize_constrained interface, since it will likely be the same as the interface for the unconstrained solvers just with additional arguments. 

is a valid one. But I think these additional arguments may make the minimize interface documentation unnecessarily complicated, while creating several difficulties when trying to fit in the constrained solver.

Furthermore, I think it would not be much trouble to include the constrained optimisation methods SLSQP and COBYLA in the same minimize_constrained interface (I think I could manage to do it in one or two days). This new interface could also be used for future constrained algorithms.

Matlab does this separation between constrained and unconstrained problems and I think they got it right. What do you three think about this?

I will present some possible signatures to this new minimize_constrained interface on future comments.

mdhaber commented 7 years ago

The "close issue" button is really easy to hit accidentally on mobile version! I've been thinking hard about this. I decided that I probably don't have anything to say that you haven't already considered, and I trust your judgment, so I'll leave it be.

rgommers commented 7 years ago

I want to provide the option of the user passing `hess_ineq`, `hess_eq` and `hess` or, alternatively, to pass only `hess_lagr`. And I want to provide several derivative options.

Agree that adding that many keywords to minimize is not an option. So either they all have to go in a single dict, which makes the interface a bit clumsier, or minimize_constrained it is.

I suggest making the minimize_constrained interface proposal on the scipy-dev mailing list at this point. Another couple of questions to consider for that:

nmayorov commented 7 years ago

@antonior92 what about this idea to introduce Constraints class to abstract constraints specification?

antonior92 commented 7 years ago

what about this idea to introduce Constraints class to abstract constraints specification?

The Constraints class is, in my opinion, the best way of integrating the solver into minimize. However, I think it would be better to take a different approach and introduce a new function minimize_constrained, for the following reasons:

  1. The minimize interface would be kept simple and reserved for unconstrained problems and problems with simple bound constraints.
  2. We would have more freedom designing a minimize_constrained interface that best suits our needs.
  3. The burden of implementing and including SLSQP and COBYLA in a new interface is not much greater than the burden of fitting ipsolver into minimize.
  4. This new interface could be used for new constrained solvers added to scipy.

Let me elaborate on 1: I want to include finite-difference options, quasi-Newton options, different ways of passing constraints, different ways of passing second-order information... All of that requires a great deal of documentation and several additional arguments in the function signature. And my point is: why complicate the minimize interface for something that will only be used by a single solver method?

The idea of creating a new interface is to keep the minimize interface clean and easy to use. It would also give us more freedom to design a better minimize_constrained interface.

I suggest making the minimize_constrained interface proposal on the scipy-dev mailing list at this point.

Ok. I will send an email to the mailing list next.

Is it a strict superset of the minimize params/keywords, or can things be left out / changed?

I would make a few changes to the existing parameters. A few modifications I can think of:

But I think the most significant changes would be on additional parameters.

Do the current constrained optimizers in minimize now get exposed 2x, or would you like to deprecate them in minimze now or in a couple of releases?

My choice would be to deprecate them and make them available only through minimize_constrained. We could keep accepting the methods but emit a warning, and maybe remove those methods from the documentation. To be coherent with version numbering, maybe we could do it for scipy 1.0.

If we decide to take this approach we would have to discuss this further.
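A sketch of what the transition could look like (everything here is hypothetical, not scipy code): minimize keeps accepting the constrained methods but emits a DeprecationWarning pointing users to the new entry point.

```python
import warnings

# Hypothetical set of methods that would move to minimize_constrained.
DEPRECATED_IN_MINIMIZE = {'slsqp', 'cobyla'}

def check_method(method):
    """Warn, but still accept, a method slated to move elsewhere."""
    if method.lower() in DEPRECATED_IN_MINIMIZE:
        warnings.warn(
            "method %r is deprecated in minimize; "
            "use minimize_constrained instead" % method,
            DeprecationWarning, stacklevel=2)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_method('SLSQP')

print(len(caught), caught[0].category.__name__)  # 1 DeprecationWarning
```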

antonior92 commented 7 years ago

Just sent an email to the mailing list and created an issue in scipy: https://github.com/scipy/scipy/issues/7648

I propose to move all discussions regarding the interface to this new issue, so other people can contribute as well.

nmayorov commented 7 years ago

Hey, Antonio. Which list did you mail to? I don't see it on scipy-dev.

antonior92 commented 7 years ago

Hey Nikolay! I accidentally sent it to scipy-dev@scipy.org. I have now sent it to the right one: scipy-dev@python.org.