I use ibex for solving financial problems, where I care a lot about execution time. I have found that the `-t` option is very good for me: my ibex runs typically don't terminate (more than 12-24 h execution time), but a good-enough solution is found in a second or less.

I haven't been able to use the `-a`, `-r`, or `--eps-h` options with much confidence for my problems, though I have tested/benchmarked them. I just noticed there is a `--kkt` option which changes the contractor logic, so that may be something to try out.

I suspect that you can use `--trace` and then use the last output as an `--initial-loup` param, but I haven't seen documentation for this, and I haven't used it myself.
Hi @boxxxie, thank you very much for these tips. I have already used `-t` to specify a limited runtime. Nonetheless, I want a tight bound on f*, and that is why I have to make ibexopt run longer. For my problem with 7 parameters, ibexopt has run for more than 15 hours (still running now).

Besides, in case you do not know it, it is useful to specify an input COV file with `-i` for warm starting. That is, if you first asked ibexopt to run for 2 hours but the result was not satisfactory, you can give ibexopt the COV file generated in the previous run to continue the optimization instead of starting from scratch. The effect should be similar to `--trace` plus `--initial-loup`.
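For example (a sketch; the COV file name below is hypothetical and the exact flags may differ between versions, see `ibexopt -h`):

```
# first run: stop after 2 hours; ibexopt saves its state in a COV file
ibexopt DDM.txt -t 7200

# later: resume from the previously generated COV file instead of starting from scratch
ibexopt DDM.txt -i DDM.cov -t 7200
```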
As for my problem, the upper bound of f* now seems very tight, but the lower bound stays at 0 (while ideally it should be very close to the upper bound). However, I still don't know how to specify a near-optimal solution as the initial point (assuming I have knowledge of it), or whether supplying such an initial solution makes any difference for ibexopt. I will try the `--kkt` option and also wait for advice from @gchabert.
Hi guys. You are right, documentation is still missing, sorry about that... but you have already got quite far with ibexopt! As Shuhua suggested, the `-i` option combined with `-t` is very useful. You never lose what you have calculated, and the effect is even better than using `--trace` and `--initial-loup`, because you keep the whole optimizer state.

You should use `--initial-loup` when you have an a priori upper bound of your objective (say, by some external knowledge of your problem). Sometimes it really accelerates the process. However, the difficulty in global optimization often lies more in proving that a minimum is global than in just finding it. So, even if you supply the actual global minimum f* with `--initial-loup`, ibex will still basically have to explore everything to prove that it is indeed the global minimum. Said differently, the hardest part is often the lower bound (as you have already experienced, by the way).
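For example (a sketch; check `ibexopt -h` for the exact option syntax):

```
# supply an a-priori upper bound ("loup") of the objective;
# 2.6e-5 is an illustrative value slightly above the best known minimum
ibexopt DDM.txt --initial-loup=2.6e-5
```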
@ShuhuaGao, if you have a least-squares problem with bound constraints only, the `--kkt` option is certainly a good choice! But I would not recommend it otherwise, as it makes ibexopt switch automatically to the "rigor" mode (which is usually slower). Furthermore, ibexopt is a default strategy, not always the best one. What you can do, if your problem is not confidential, is to submit it to the community. We have other strategies to try. Besides that, I don't have a magic answer. Interval methods are by nature powerful but slow.
DDM.txt
Hi @gchabert, thank you for your reply. The latest status: after around another 10 hours, I have f* in [-0, 2.52861312207e-05]. The upper bound has been slightly reduced, but the lower bound is still zero. However, I want the lower bound to be very close to the upper bound, to show that the solution at hand is effectively a global minimum. The problem formulation in Minibex is attached ("DDM.txt") and also shown below. I am new to interval analysis and optimization, and thus not sure whether some kind of transformation/reformulation may help.
```
Constants
N = 26; // number of measures
// Data for output voltage in V
Vl[N] = (-0.2057; -0.1291; -0.0588; 0.0057; 0.0646; 0.1185; 0.1678; 0.2132; 0.2545;
         0.2924; 0.3269; 0.3585; 0.3873; 0.4137; 0.4373; 0.459; 0.4784; 0.496; 0.5119;
         0.5265; 0.5398; 0.5521; 0.5633; 0.5736; 0.5833; 0.59);
// Data for output current in A
I[N] = (0.764; 0.762; 0.7605; 0.7605; 0.76; 0.759; 0.757; 0.757; 0.7555; 0.754;
        0.7505; 0.7465; 0.7385; 0.728; 0.7065; 0.6755; 0.632; 0.573; 0.499; 0.413;
        0.3165; 0.212; 0.1035; -0.01; -0.123; -0.21);
k = 1.3806503*1e-23;  // Boltzmann constant in J/K
T = 33 + 273.15;      // Cell temperature in K
q = 1.60217646*1e-19; // Magnitude of charge on an electron in C

Variables
Iph in [0,1];   // Photo-generated current in A
Isd1 in [0,1];  // Reverse saturation current in uA (first diode)
Isd2 in [0,1];  // Reverse saturation current in uA (second diode)
Rs in [0,0.5];  // Series resistance in ohm
Rsh in [0,1];   // Shunt resistance in hundred-ohm
n1 in [1,2];    // Diode ideality factor (first diode)
n2 in [1,2];    // Diode ideality factor (second diode)

// current through one diode
function Id(VL, IL, Isd, Rs, n)
  return Isd*1e-6*(exp(q*(VL+Rs*IL)/(n*k*T)) - 1);
end

// given VL and IL, predict the output with our model
function fIL(VL, IL, n1, n2, Iph, Isd1, Isd2, Rs, Rsh)
  return Iph - Id(VL, IL, Isd1, Rs, n1) - Id(VL, IL, Isd2, Rs, n2) - (VL + Rs*IL)/(100*Rsh);
end

// minimize the squared error between prediction and true measurement
Minimize
sum(i = 1:N, (I(i) - fIL(Vl(i), I(i), n1, n2, Iph, Isd1, Isd2, Rs, Rsh))^2);
```
By applying some heuristic algorithms, the global minimum is likely to be 2.5096940710399997e-5. How can I show with ibexopt, if possible, that a slightly smaller number like 2.4e-5 is a lower bound? 😅
Thanks for posting your problem! We'll try alternate strategies. To be honest, I am afraid ibexopt is not really suited to solving least-squares problems, because the objective expression gets very long with the number of measurements (and there are 26 in your case).

The easiest way to prove that "lb" is a lower bound is by introducing an inequality f(x) >= lb. (Note that your problem would not be unconstrained anymore.) To avoid repeating the objective expression f(x), use a slack variable "y", that is:
```
Minimize y;

Constraints
y = f(x);
y >= lb;
end
```
Tell me if it works.
Hi!

I think that for proving that lb is a lower bound, you have to solve the problem f(x) < lb. If this problem has no solution, then lb is a lower bound. I do not quite understand your strategy, Gilles...
> I think that for proving that lb is a lower bound you have to solve the problem f(x) < lb. If this problem has no solution then lb is a lower bound.

Yes, I agree with you @rilianx. Thanks.
I actually want to find the greatest lower bound (i.e., the infimum), though it seems impossible to locate the exact value in practice due to the extremely long running time. I will try some reasonable numbers to test feasibility.
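For instance, a candidate lower bound lb = 2.4e-5 could be tested with a Minibex model like the following, combining Gilles' slack-variable trick with the corrected direction (prove that f(x) <= lb is infeasible). This is only a sketch: the actual LB-feasibility.txt mentioned below is not shown in this thread, and the Constants, Variables and function definitions are those of DDM.txt above.

```
// ... same Constants and function definitions as in DDM.txt ...

Variables
// ... the 7 model parameters declared as in DDM.txt, plus a slack variable:
y in [0,1];

Minimize y;

Constraints
y = sum(i = 1:N, (I(i) - fIL(Vl(i), I(i), n1, n2, Iph, Isd1, Isd2, Rs, Rsh))^2);
y <= 2.4e-5; // proving infeasibility here proves that 2.4e-5 is a lower bound of f*
end
```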
> I think that for proving that lb is a lower bound you have to solve the problem f(x) < lb. If this problem has no solution then lb is a lower bound.
Yes, you are right of course. Thank you for correcting me.
@gchabert

> Tell me if it works.
I tried the feasibility problem (LB-feasibility.txt). Unfortunately, it is still as slow as before. That is, ibexopt reports "time limit reached" and "no feasible point found". Of course, it is expected that there are no feasible points, but we cannot confirm this since the time limit was reached. We may now partially conclude that interval-based optimization is not suitable for my problem. Nonetheless, ibexopt worked well for a simpler 5-parameter case, taking about 4 hours.
Hi all!

As Gilles explained, it is typical that, given the global minimum f* of your problem, proving that

f(x) <= f* - epsilon,  x in [x]

is infeasible for a small epsilon turns out to be a hard problem. You may improve this situation by using the minimizer x* as well. Give a try to solving

f(x) <= f*,  ||x - x*|| >= delta,  x in [x]

for some delta. Luckily, it may happen that you quickly prove emptiness for rather small values of delta. I would try different values of delta just to see how much this improves, like delta in {10, 1, 0.1, 0.01}.

If this actually improves the timing, then for a helpful value of delta you need to prove that there is no local minimizer other than x* inside the region ||x - x*|| <= delta. There are several ways to do this; we can discuss them if some reasonable values of delta help. This technique is a simple instantiation of exclusion regions, often linked to Arnold Neumaier and Hermann Schichl.

Best,
Alexandre
> f(x) <= f*,  ||x - x*|| >= delta,  x in [x]

Thanks @goldsztejn-a. I will try it and give you feedback soon.
@goldsztejn-a

Progress update: following your suggestion, I decreased delta gradually according to the x* and f* already known to me. The results show that the global solution lies in a very narrow range centered at x*. For 6 out of the 7 parameters, the range radius is 2.5e-3, while for the remaining parameter it is 2.5e-4 (please see d1 and d2 in DDM-SSE-delta.txt). If I further reduced the range, ibexopt again took a very long time and could not decide feasibility.

Hi @gchabert, could you kindly answer two questions?
Great!

Therefore, with delta = 0.01 you can easily prove that

f(x) <= f*,  x in [x],  ||x - x*||_oo >= delta

is infeasible. Since you used the infinity norm, ||x - x*|| <= delta is just the box [x* - delta, x* + delta]. Since one constraint is active at the minimizer, [x* - delta, x* + delta] is not included in the initial box [x]. Therefore define

[xE] = [x* - delta, x* + delta] /\ [x]

Now, you just need to prove that there is one unique local minimizer x* in [xE] and that f(x*) = f*. This can be done by solving the first-order conditions in the box, like the KKT conditions. Since the box [xE] is rather small, you have good chances of doing that efficiently!

In case of success, you have proved that x* is the global minimizer (and f* the global minimum).

Alexandre
Shuhua, to test Alexandre's idea, I suggest you write a small C++ program like this:
```cpp
#include <ibex.h>

using namespace std;
using namespace ibex;

int main() {
	// load the original minimization problem
	System sys("DDM.txt");
	// build the Kuhn-Tucker (first-order optimality) system from it
	KuhnTuckerSystem kkt(sys,true);

	// approximate global minimizer found previously
	Vector xopt {{0.7607810787383162, 0.22597423935334135, 0.7493480732699083, 0.03674042971025965, 0.554854504198561, 1.4510167578821171, 1.9999999999998637}};

	// radii of the exclusion box around xopt (the smaller radius d2 for the 4th variable, Rs)
	double d1=2.5e-3;
	double d2=d1/10;
	IntervalVector delta{{-d1,d1},{-d1,d1},{-d1,d1},{-d2,d2},{-d1,d1},{-d1,d1},{-d1,d1}};

	// restrict the variable part of the KKT box to the small box around xopt
	kkt.box.put(0,sys.box & (xopt + delta));

	// solve the KKT system rigorously
	DefaultSolver solver(kkt);
	solver.solve(kkt.box);
	solver.report();
}
```
See also http://www.ibex-lib.org/doc/system.html#kuhn-tucker-conditions. It should do the job. Of course, it may take a long time. But if it terminates with a unique certified solution, as Alexandre said, you've got your global minimizer! Just check that the minimum f* you obtain corresponds to the value LB you used earlier to create your exclusion region (f* must not be larger).
> When checking whether a problem is feasible with ibexopt, it usually finishes immediately (no more than two seconds) if the problem is infeasible (e.g., you may run my code DDM-SSE-delta.txt to see the result). Is this check reliable in theory? I am totally new to interval arithmetic and constraint programming and don't know its working principle for checking feasibility (why is it so quick?).
Yes, the infeasibility result is reliable. It is quick because, roughly speaking, constraint programming techniques are based on elimination reasoning, so they are particularly well adapted to proving infeasibility. You may be surprised, since ibexopt is in contrast very slow on your initial problem DDM.txt. But consider that ibexopt probably spends only a very short amount of time reducing the search space to a thin box around x*. Maybe not as fast as you did by hand (as it does not know the bound f* in advance) but, clearly, the hard part is inside. There may be some clustering effect inside.
> As you have noticed, the numerical values in my problem are very small now. There is no need to worry about round-off errors with interval arithmetic, right? In other words, do I need to scale the variables, e.g., multiply all numbers by 1e3?
You don't need to scale the variables: they all have similar magnitudes at the minimum, which is good news. For the constants, I have to tell you that roundoff occurs when parsing a Minibex file. For instance, your decimal value 1.60217646*1e-19 is rounded when converted to binary format. If you want to handle this rounding error, you have two options: either enter a small interval instead, or directly use a hexadecimal number (preceded by a #).
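The interval option might look like this (a sketch, assuming Minibex accepts interval constants in this bracket form; the interval only needs to enclose the decimal value):

```
Constants
// instead of q = 1.60217646*1e-19; use a thin enclosing interval:
q = [1.60217645*1e-19, 1.60217647*1e-19];
```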
@gchabert Thank you. I will try your code if needed. For now, I want to first check the first-order condition (i.e., zero gradient) to see whether there is a unique root in the narrow space (if I am lucky).

In fact, both metaheuristics and a (local) NLP solver indicate that the `xopt` (in your code) is the global minimizer; I just want to verify it more rigorously. Starting from this view, if I can find u* and v* together with `xopt` that satisfy the KKT conditions (we only have bound constraints here), then it is also shown that `xopt` is the real global optimizer, right? Of course, `xopt` may not be exact enough to meet the KKT conditions exactly. Thus, I hope that solving the first-order condition in that narrow range will produce one, or a few, stationary points, one of which should be extremely close to `xopt`.
In your code, it seems `solver.solve(kkt.box)` locates x*, u*, and v* directly. Perhaps our knowledge of `xopt` can help to accelerate it.
(I will try these ideas tomorrow since it is midnight in Singapore now :sob:)
@ShuhuaGao

EDIT: In my program you can set the solver precision as follows:

```cpp
DefaultSolver solver(kkt, eps);
```

I've tried with `eps=1e-6`; it terminates after 4.5 minutes but returns a bunch of "unknown" boxes, i.e., boxes that require further processing. You may try smaller values for `eps`.
> But consider that ibexopt probably spends only a very short amount of time to reduce the search space to a thin box around x*.
@gchabert You are right. I played with the greatly reduced search domain. After 10 hours, the bound of f* is [3.11953014135e-06, 2.51301051574e-05], not much improvement over the original one [-0, 2.52861312207e-05]. It is clear that ibexopt spent most of its time in a small local region around x*.
@gchabert I tried `eps=1e-9`, which took about 30 minutes, but the result is still "unknown":
```
done! but some boxes have 'unknown' status.
number of solution boxes: --
number of boundary boxes: --
number of unknown boxes:  6248
number of pending boxes:  --
cpu time used:            732.162s
number of cells:          22863
```
@goldsztejn-a As for the naive first-order condition (zero gradient only), I realized that it is much more challenging than I thought. That is, it is even difficult to find all the roots of the 7 nonlinear equations in a given range. To get precise results, we need to turn to ibexsolve, which gives a problem similar to the KKT conditions.
Thought of the day: if the minimum of the least-squares problem is very close to 0, the KKT conditions are probably ill-conditioned (close to singular). This hinders the solver from isolating (and certifying) the solution.
> Starting from this view, if I can find u* and v* together with xopt that satisfy the KKT conditions (we only have bound constraints here), then it is also shown that xopt is the real global optimizer, right?
Well, if xopt, u* and v* satisfy the KKT conditions, xopt is just a local optimizer. You can deduce that it is a global optimizer if you follow Alexandre's strategy. That's why you need to solve the KKT conditions with ibexsolve.
> That is, it is even difficult to find all the roots of the 7 nonlinear equations in a given range.
You should not have 7 equations but 8, since one bound constraint is active (the upper bound of n2).
On second thought, I noticed that there was something wrong with my method above for excluding the region that cannot contain the global minimum. In short, we cannot shrink the possible ranges of the variables separately; instead, we must use some kind of vector norm that treats them jointly (which should be what @goldsztejn-a really means).

Consider a naive example f(x1, x2) = x1^2 + x2^2 and suppose we know a near-optimal point xr = (0.1, 0.1). By adding the three constraints

|x[1] - xr[1]| >= 0.5
|x[2] - xr[2]| >= 0.05
f <= 0.02

it is obvious that the resulting problem is infeasible (the first constraint alone forces x1^2 >= 0.16 > 0.02). Following my previous idea, I would conclude that the global optimizer lies in the region

|x[1] - xr[1]| <= 0.5
|x[2] - xr[2]| <= 0.05

which is unfortunately wrong.
Yes, you did not use the infinity norm properly. Sorry, I did not pay attention to that. You should rather write:

min(|x[1] - xr[1]| - 0.5, |x[2] - xr[2]| - 0.05) >= 0

However, I don't see why the problem would be obviously infeasible.
> You should rather write:
> min(|x[1] - xr[1]| - 0.5, |x[2] - xr[2]| - 0.05) >= 0

This condition also seems wrong, because it is equivalent to the two inequality constraints I listed above (min >= 0 means both terms are >= 0). We should use the infinity norm here, that is, change min to max. With max, the new problem is feasible, so we cannot conclude that the region max(|x[1] - xr[1]| - 0.5, |x[2] - xr[2]| - 0.05) <= 0 contains the global minimizer (and indeed it does not, in this example).
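For concreteness, this corrected exclusion test can be written as a self-contained Minibex CSP (a sketch, assuming the max and abs operators are available in Minibex; the file name toy-exclusion.txt is hypothetical):

```
// toy-exclusion.txt: is there a point with f(x) <= 0.02 outside the box around xr = (0.1, 0.1)?
Variables
x[2] in [-1,1];

Constraints
x(1)^2 + x(2)^2 <= 0.02; // f(x) <= candidate bound
max(abs(x(1) - 0.1) - 0.5, abs(x(2) - 0.1) - 0.05) >= 0; // outside the box around xr
end
```

Running ibexsolve on this file should find feasible points (e.g., near (0,0)), confirming that the box around xr cannot be claimed to contain the global minimizer.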
Yes, I meant max.
Now let's consider the original 7-parameter nonlinear least-squares problem. I have the near-optimal point xr as shown above. Even with a quite loose constraint inf_norm(x - xr) >= 0.9, ibexopt cannot detect the infeasibility within two minutes (and I am afraid a longer running time will not help). We are back to the starting point now 😭.
I think you should not use ibexopt when you add constraints like inf_norm(x - xr) >= 0.9. Indeed, such a constraint actually changes the optimization problem into a new one, whose minimizer could be (and probably is) on the boundary of this new constraint.
Basically, once you have some trusted x* and f* obtained from local search (or a genetic algorithm, which I also call local search), you can solve two problems for a guessed value of delta. First:

(NCSP1)  f(x) <= f*,  ||x - x*|| >= delta,  x in [x]

This one should be empty, and hopefully ibexsolve will prove this easily if delta is large enough.

Second:

(NCSP2)  kkt(x) = 0,  ||x - x*|| <= delta,  x in [x]

Here you should have only one solution, and ibexsolve may find it quickly if delta is small enough.

They are both constraint satisfaction problems, so you should not use ibexopt. In (NCSP1) delta has to be large enough, and in (NCSP2) it has to be small enough. I would recommend testing several values of delta with (NCSP1) and choosing the smallest for which you can prove emptiness. Or, to be more efficient, you could solve problems

(NCSP1')  f(x) <= f*,  delta1 <= ||x - x*|| <= delta2,  x in [x]

to improve the value of delta which will then be used in (NCSP2).
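On the toy problem above, (NCSP1') could look like this in Minibex (a sketch; the delta values are illustrative): if ibexsolve proves emptiness, no point with f(x) <= 0.02 lies in the annulus d1 <= ||x - xr||_oo <= d2, and one can then retry with a smaller d1.

```
Constants
d1 = 0.3;
d2 = 0.6;

Variables
x[2] in [-1,1];

Constraints
x(1)^2 + x(2)^2 <= 0.02;                     // f(x) <= f*
max(abs(x(1) - 0.1), abs(x(2) - 0.1)) >= d1; // ||x - xr||_oo >= d1
max(abs(x(1) - 0.1), abs(x(2) - 0.1)) <= d2; // ||x - xr||_oo <= d2
end
```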
(NCSP2) is critical. It seems you have one active constraint, so you may want to build the KKT system yourself and maybe solve it in several steps, e.g., with no active constraint first, and then with one active constraint. Having a good initial domain for the multipliers is probably critical. I am not sure how ibex handles this in the KKT system...

Best!
Thank you very much @goldsztejn-a for your detailed reply. It is definitely helpful in theory. However, according to my limited experiments with ibex, both problems seem quite challenging in practice.

1. For (NCSP1), the smallest workable delta value is at least 0.9, which is essentially useless because the domain width is only 1. Perhaps some CSP solver may do better, but I don't have a sensible choice for now.
2. For (NCSP2), even in the narrow range found earlier (radius 2.5e-3), ibexsolve cannot find (all) the roots of the KKT system and did not give useful results after running for hours.

Anyway, thank you for your enlightening ideas and I will keep on exploring them.
Hmm, then Gilles' initial remark seems to apply perfectly here! In a least-squares problem with many measurements, the cost function has a huge expression and is not suited to interval computations...

Usually, minimax problems are considered harder than least-squares problems, but here it may turn out that a minimax solution is easier to compute using global optimization. I am not sure whether this makes sense for your problem, but my personal feeling is that minimax parameter optimization makes more sense than least-squares parameter optimization; it is simply harder for local solvers and therefore not preferred. Would such a model be of interest to you?
Hi @goldsztejn-a, could you further explain the minimax optimization? To my limited knowledge, minimax optimization is adopted when there is parameter uncertainty (like in robust optimization), but I cannot see why it is easier than least squares. It seems to me that it only increases the complexity.
In fact, the problem we keep discussing is from a recent paper:

Chenouard, Raphael, and Ragab A. El-Sehiemy. "An interval branch and bound global optimization algorithm for parameter estimation of three photovoltaic models." Energy Conversion and Management 205 (2020): 112400.

where the authors also adopt ibex and, like our result here, cannot get the global optimum for the 7-parameter case. I want to do it myself to see whether we can push the result further.
The minimax parameter estimation problem I have in mind is

(MINIMAX)  min_{a in [a]}  max{ |f(a,x1) - y1|, |f(a,x2) - y2|, ..., |f(a,xK) - yK| }

It consists of finding the parameter value that minimizes the worst error with respect to the measurements. The problem (MINIMAX) can be turned into a standard inequality-constrained optimization problem:

(MINIMAX')  min t  s.t.  -t <= f(a,xi) - yi <= t (i = 1,...,K),  a in [a]

For local optimization, the least-squares problem is much easier than both (MINIMAX) and (MINIMAX'). For global optimization, it may turn out that (MINIMAX') is easier: each pair of constraints -t <= f(a,xi) - yi <= t has an easy expression!
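For the DDM model, (MINIMAX') might be written in Minibex roughly as follows (a sketch reusing the Constants, Variables and functions of DDM.txt; the constraint loop syntax is an assumption here, otherwise the 26 constraint pairs can be written out explicitly):

```
// ... same Constants and function definitions as in DDM.txt ...

Variables
// ... the 7 model parameters declared as in DDM.txt, plus the worst-case absolute error:
t in [0,1];

Minimize t;

Constraints
for i = 1:N;
  I(i) - fIL(Vl(i), I(i), n1, n2, Iph, Isd1, Isd2, Rs, Rsh) <= t;
  fIL(Vl(i), I(i), n1, n2, Iph, Isd1, Isd2, Rs, Rsh) - I(i) <= t;
end
end
```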
Got your idea, @goldsztejn-a. This is called maximum absolute error (MAE) and is essentially the infinity-norm metric. In fact, this metric has also been attempted in the paper above. Their description shows that ibex cannot handle the MAE formulation for 7 parameters either (within 20000 seconds). Nonetheless, the 5-parameter case with MAE is much faster than the least-squares counterpart, just as you expected. Besides, in the parameter identification literature, MAE is much less commonly used than least squares due to its sensitivity to outliers.
Oh yes, outliers are definitely another reason to prefer the 2-norm!
Yes, this appears to be what we can do for now. Thanks, @goldsztejn-a. In fact, in my personal opinion, the minimum currently attained in the literature is already small enough for practical purposes, and it does not make much sense in practice to pursue even smaller minima. It seems that one author of the above paper is with the same lab (CNRS) as you 😁.
Indeed, Raphael is a colleague of mine in Nantes! We had discussed parameter optimization, and most probably will again in the future. Best!
Hi, I want to perform global optimization for a nonlinear problem with 5 variables, which is essentially a nonlinear least-squares problem with only bound constraints. It seems that ibexopt was very slow and took about 13000 seconds with default settings. Could you please give some advice on acceleration if possible? For example:

1. Can I provide an initial (near-optimal) solution to ibexopt if I know that it is close to the global optimum?
2. Are there other options that could speed up ibexopt? Of course, I don't want to lower the precision.

Currently, I use the simplest form by typing `ibexopt problem.txt`, where `problem.txt` is the description of the problem in the Minibex language. Thank you.