ridgeworks / clpBNR

CLP(BNR) module for SWI-Prolog
MIT License
38 stars 5 forks source link

global_maximum/2 #7

Closed JeanChristopheRohner closed 3 years ago

JeanChristopheRohner commented 3 years ago

Hi! I am not sure, but is this correct?

?- [X2, X3, X4]::real(0.01, 0.99), {X1 == (X2 + X3 - X2 * X3) * X4}, global_maximum(X1, MAX).
X2::real(0.5495054587018423, 0.9900000000000001),
X1:: 1.22761...,
MAX:: 1.22761...,
X4::real(0.7295252238616355, 0.9900000000000001),
X3::real(0.5495054587018423, 0.9900000000000001).

I am using clpBNR v0.9.3alpha Kinds regards, JCR

ridgeworks commented 3 years ago

Depends; what is the question? global_maximum finds the minimum of an expression by performing a bifurcating search on the variables in that expression. So I think the question you want to ask is what is the global maximum of the expression (X2 + X3 - X2 * X3) * X4:

?- [X2, X3, X4]::real(0.01, 0.99),global_maximum((X2 + X3 - X2 * X3) * X4, MAX,4).
X2::real(0.009999999999999998, 0.9900000000000001),
X3::real(0.009999999999999998, 0.9900000000000001),
X4::real(0.4999451934462877, 0.9900000000000001),
MAX:: 0.989... .

Due to the characteristics of this expression, this takes several seconds so I specified a reduced precision (4). Note that this just provides the maximum value of the expression, not the minimizers (values of X2, X3,X4). This takes an additional solver step:

?- [X2, X3, X4]::real(0.01, 0.99),global_maximum((X2 + X3 - X2 * X3) * X4, MAX,4), splitsolve([X2,X3,X4]).
X2:: 0.98390...,
X3:: 0.98999...,
X4:: 0.99000...,
MAX:: 0.98984... .

And this is just one of many approximately equal solutions since the precision is limited.

This expression is also somewhat problematical because the solution is not a local optima internal to the solution "box" (defined by [X2,X3,X4]); the minimum occurs at the edge defined by X2=0.99, X3=0.99, X4=0.99. So local optima constraints which could help precision and convergence cannot be applied.

The issue with your query example is that the global_maximum predicate has no visibility into the expression variables; it only sees X1. While it might be possible to extract those variables from the constraint network somehow, it's just simpler all around if the application deals with it.

The new section in the "User Guide" on "Global Optimization" should help with this. Let me know if it doesn't, or if I haven't addressed your question.

JeanChristopheRohner commented 3 years ago

Thanks for helping me understand. I meant to say that i was expecting a maximum around 1. But i realise now that the equation i had has several maxima.

I am finding global_maximum and global_minimum very useful, and when there are unique maxima or minima i usually get the minimizers. For example [X2, X3]::real(0.1, 0.9), {X1 == X2**X3}, global_maximum(X1, _). yields

X2:: 0.9000...,
X1:: 0.98951...,
X3:: 0.10000... .

But would you consider this an incorrect use of these predicates? I.e. compared to having the whole equation as the first argument in global_maximum.

ridgeworks commented 3 years ago

Thanks for helping me understand. I meant to say that i was expecting a maximum around 1. But i realise now that the equation i had has several maxima.

Mathematically, there is only one maximum (0.98901) when all the values are 0.99, but global_maximum does a bifurcating search until the precision limit is met. Further sub-divsion when generating the minimizers (using splitsolve) is also precision limited so you wind up with many practically identical answers within the bounds of the max value of the expression (same issue as finding roots of polynomials). If there were several actual maxima, splitsolve will separate them but may still generate many solutions for each actual maxima due to the limitations of the search predicates. (Sorry if this is too much information.)

The main thing is to recognize the built-in limits based on precision. If the fixed point iteration was left to run until all 15 digits (or so) were correct, most everything would just take too long.

But would you consider this an incorrect use of these predicates?

As your first example demonstrates, there are many cases where not using the expression (which provides global_maximum visibility into the expression variables) will not generate the results you would expect. But you can do something like:

?- [X2, X3]::real(0.1, 0.9), X1 = X2**X3, global_maximum(X1, X1_max).
X1 = X2**X3,
X2:: 0.9000...,
X3:: 0.10000...,
X1_max:: 0.98951... .

where X1 is the symbolic expression and X1_max is the maximum value of the evaluation of X1 given the interval values of X2 and X3.

In an ideal world, these two usage patterns should be the same, but I don't yet know how to get there.

Also, there may be examples where the variables narrow sufficiently to be deemed minimizers in the process of finding the global maximum, but that's not guaranteed even if they are unique (general "dependency" issue of interval arithmetic). In this example where each variable is used exactly once, everything works in your favour.

JeanChristopheRohner commented 3 years ago

Thanks again for explaining. The reason I am looking at the possibility to just use a variable as an argument to global_maximum/minimum is that, conceptually, my programs look like the one below and that i am using meta interpreters (in the example i am using integers to simplify the output). My (maybe subjective) perception is that in all cases in which there is a clear cut maximum or minimum I have been able to get it by just using a variable (instead of having the whole equation as an argument to global_maximum). Sorry if this is a bit abstract, but as soon as our paper gets accepted for publication (or rejected, hehe) I will be able to tell you more (if you are interested). Generally i am finding clpBNR extremely useful.

:- use_module(library(clpbnr)).

x1(X1):- [X1]::integer(2, 10).
x2(X2):- [X2]::integer(2, 10).
x3(X3):- [X3]::integer(2, 10).
x4(X4):- x1(X1), x2(X2), {X4 == X1 + X2}.
x5(X5):- x3(X3), x4(X4), {X5 == X3 - X4}.

prove(true, true):- !.
prove((G1, G2), (P1, P2)):- !, prove(G1, P1), prove(G2, P2).
prove(G, P):- predicate_property(G, imported_from(_)), !, call(G), P = subproof(G, true).
prove(G, subproof(G, Sub)):- clause(G, B), prove(B, Sub).

test:- prove(x5(X5), Proof), global_maximum(X5, _), writeln(Proof).
ridgeworks commented 3 years ago

I don't know enough to comment on whether all your use cases will work using this pattern. In particular, I suspect there will be issues if the underlying expressions are non-linear and contain multiple occurrences of one or more variables (directly or indirectly).

In this example the equations are linear, single use (X5==X3-(X1+X2)) so it's safe. But be wary that if you encounter unexpected results, this may well be an issue as your initial example demonstrated.

I'm always looking for applications of this technology, so when the dust settles I'd be very interested in reading your paper (or a draft).

JeanChristopheRohner commented 3 years ago

OK I understand. Could you recommend a strategy that I could use to systematically test if using just a variable in global_maximum/minimum works as expected (in cases where there is a clear cut max or min)? In the paper I mentioned I used clpr and just iterated through all the solutions (then using max_list/2 to find the max). This works but is a bit crude, so it would be nice to have something more powerful. In my application i need at most a precision of 4 decimals.

ridgeworks commented 3 years ago

Without thinking about it much, a possible test is to expand the variable back an expression of interval values, so using your last example, expand X5 to an expression in X2,X3,X4:

 X5 --> X3-X4 --> X3-(X1+X2)

If the expanded expression contains only single instances of any of the variables, I think it will work as you expect. Note that your original example (X1 --> (X2 + X3 - X2 * X3) * X4) does not have this property, i.e., multiple instances of X2 and X3 that cannot both be factored out.

I don't really understand what you're trying to model or what kind of answers you expect to be produced, but let me turn the question around. Why not allow expressions to be defined in your x? rules, as in:

x1(X1):- X1::integer(2, 10).
x2(X2):- X2::integer(2, 10).
x3(X3):- X3::integer(2, 10).
x4(X4):- x1(X1), x2(X2), X4 = X1 + X2.
x5(X5):- x3(X3), x4(X4), X5 = X3 - X4.

and:

test:- prove(x5(X5), Proof), global_maximum(X5, X5_max), writeln(Proof).

While X5 is an expression, X5_max should be the same as the X5 you're currently producing. (Any other expressions can be converted to intervals by {X == Exp}.)

And global_maximum should work as per design intent independent of the nature of the underlying expressions.

JeanChristopheRohner commented 3 years ago

"Why not allow expressions to be defined in your x? rules, as in..."

Thats really interesting, and a new insight for me! I think it's fascinating that Prolog is so good at symbolic manipulation.

I experimented with having expressions in my rules and handing an expression to global_maximum/2 as you suggested. I am getting the expected solutions but I can't yet figure out how to get a Proof that contains the results of calculations at each step, instead of the actual expression.

With my old approach:

:- use_module(library(clpBNR)).

x1(X1):- [X1]::integer(2, 10).
x2(X2):- [X2]::integer(2, 10).
x3(X3):- [X3]::integer(2, 10).
x4(X4):- x1(X1), x2(X2), {X4 == X1 + X2}.
x5(X5):- x3(X3), x4(X4), {X5 == X3 - X4}.

prove(true, true):- !.
prove((G1, G2), (P1, P2)):- !, prove(G1, P1), prove(G2, P2).
prove(G, P):- predicate_property(G, imported_from(_)), !, call(G), P = subproof(G, true).
prove(G, subproof(G, Sub)):- clause(G, B), prove(B, Sub).

test:- prove(x5(X5), Proof), global_maximum(X5, _), writeln(Proof).

The resulting proof came out as (I added line breaks and tabs for readability):

subproof(x5(6),(
    subproof(x3(10),
        subproof([10]::integer(2,10),true)),
    subproof(x4(4),(
        subproof(x1(2),
            subproof([2]::integer(2,10),true)),
        subproof(x2(2),
            subproof([2]::integer(2,10),true)),
        subproof({4==2+2},true))),
    subproof({6==10-4},true)))

With your suggested approach:

:- use_module(library(clpBNR)).

x1(X1):- [X1]::integer(2, 10).
x2(X2):- [X2]::integer(2, 10).
x3(X3):- [X3]::integer(2, 10).
x4(X4):- x1(X1), x2(X2), X4 = X1 + X2.
x5(X5):- x3(X3), x4(X4), X5 = X3 - X4.

prove(true, true):- !.
prove((G1, G2), (P1, P2)):- !, prove(G1, P1), prove(G2, P2).
prove(G, P):- predicate_property(G, imported_from(_)), !, call(G), P = subproof(G, true).
prove(G, subproof(G, Sub)):- clause(G, B), prove(B, Sub).

test:- prove(x5(X5), Proof), global_maximum(X5, _), writeln(Proof).

I get:

subproof(x5(10-(2+2)),(
    subproof(x3(10),
        subproof([10]::integer(2,10),true)),
    subproof(x4(2+2),(
        subproof(x1(2),
            subproof([2]::integer(2,10),true)),
        subproof(x2(2),
            subproof([2]::integer(2,10),true)),
        subproof(2+2=2+2,true))),
    subproof(10-(2+2)=10-(2+2),true)))

I preferred getting the outcome of a calculation at each step in the proof, e.g. x5(6) instead of x5(10-(2+2)) in the first line of the proofs, but I realise that it might be difficult to implement this (since an expression is passed around in the last example). I spent a number of hours modifying my meta interpreter trying to achieve this but ultimately had no success.

I will try to experiment some more with {Expression == Interval} as you suggested; maybe a solution is to have an additional argument in prove/2 that holds the interval... A hurdle is that I am not in control of which predicates and arguments are mathematical expressions in a program and which are not (my presumed end users are the one's that are supposed to write clauses analogous to the x1, x2, ... examples above).

Thanks anyway for taking your time to look at my problem! I really appreciate it.

ridgeworks commented 3 years ago

I see your point. If this expression route is useful, you might consider a "post-processing" step to generate a a more user friendly form, e.g.,

x1(X1):- [X1]::integer(2, 10).
x2(X2):- [X2]::integer(2, 10).
x3(X3):- [X3]::integer(2, 10).
x4(X4):- x1(X1), x2(X2), X4 = X1 + X2.
x5(X5):- x3(X3), x4(X4), X5 = X3 - X4.

prove(true, true):- !.
prove((G1, G2), (P1, P2)):- !, prove(G1, P1), prove(G2, P2).
prove(G, P):- predicate_property(G, imported_from(_)), !, call(G), P = subproof(G, true).
prove(G, subproof(G, Sub)):- clause(G, B), prove(B, Sub).

test:- prove(x5(X5), Proof), global_maximum(X5, _), reduce_proof(Proof,RProof), writeln(RProof).

reduce_proof(true, true) :- !.
reduce_proof(subproof(G,P), subproof(RG,RP)) :-
    G=..[I,Exp], !,
    catch(RExp is Exp,_,RExp=Exp),  % if evaluation traps, no change
    RG=..[I,RExp],
    reduce_proof(P,RP).
reduce_proof(subproof(G,P), subproof(G,P)).
reduce_proof((P1,P2), (RP1,RP2)) :- !,
    reduce_proof(P1,RP1),
    reduce_proof(P2,RP2).

which outputs (suitably hand formatted):

subproof(x5(6),(
    subproof(x3(10),subproof([10]::integer(2,10),true)),
    subproof(x4(4),(
        subproof(x1(2),subproof([2]::integer(2,10),true)),
        subproof(x2(2),subproof([2]::integer(2,10),true)),
        subproof(2+2=2+2,true))
        ),
    subproof(10-(2+2)=10-(2+2),true)
    )
)

Additional reduce_proof clauses could be added to handle other cases, but I don't know whether this is general enough to handle all your use cases.

JeanChristopheRohner commented 3 years ago

Thank you! I major obstacle for me was how to distinguish between compounds that are non numeric and the ones that are; your catch/3 does the trick. I'll try to integrate your code in what i am building.

ridgeworks commented 3 years ago

One last caveat: This is all good if the intervals narrow to point values (integers or rationals), but is will fail with intervals that don't narrow to a point (due to floating point imprecision). If that's a use case you want to support, than you probably have to add additional clauses to reduce_proof.

Aside: I find turning exceptions into failures using catch a very useful technique. Prolog (unfortunately IMO) generates many exceptions where a logical failure would be more useful. catch is relatively expensive but the alternative is extensive pre-filtering on arguments which is a pain.

JeanChristopheRohner commented 3 years ago

Thanks for the tips!