jump-dev / MathOptInterface.jl

A data structure for mathematical optimization problems
http://jump.dev/MathOptInterface.jl/
Other
397 stars 87 forks source link

Set request: WeightedPowerMeanCone #977

Closed chriscoey closed 2 years ago

chriscoey commented 4 years ago

The power cone in MOI is only 3-dimensional, but in some places it is defined in a much more general way, such as in the Conic Benchmark Format (see CBF docs page 13). I'd propose to use that definition (Hypatia uses it too), even though arguably the ability to use the norm on the RHS might be of questionable utility. edit: out of scope

The geometric mean cone in MOI is (n+1)-dimensional but only allows powers 1/n. , when really it could use any vector of powers on the unit simplex (positive values summing to 1 - see "Constructing Self-Concordant Barriers for Convex Cones" by Nesterov; Hypatia supports it). We could either generalize the geomean cone for more general powers, or add a new cone (called what? maybe GeneralizedGeometricMeanCone? I don't love it). edit: out of scope

edit: Proposal here is for the (1+n)-dim WeightedPowerMeanCone (y, x) in R x R_{>= 0}^n : y^sum(alpha) <= prod x_i^alpha_i. This is the hypograph of a concave product of powers on unit simplex alpha/sum(alpha).

chriscoey commented 4 years ago

Anyone have thoughts on this?

chriscoey commented 4 years ago

Given how long it takes to add a cone and associated bridges, I lack the appetite for adding the CBFv3 n-dimensional power cone of questionable utility, but I will try adding the generalized geomean cone in #997

blegat commented 4 years ago

We could have

See https://en.wikipedia.org/wiki/Generalized_mean#Geometric_mean for the name WeightedPower. I am not saying we should add all these 4 cones but we should have a consistent long term naming strategy.

Then we have the bridges: 1) GeneralizedPowerCone is bridged to WeightedPowerMeanCone and SOC 2) WeightedPowerMeanCone is bridged to PowerCone 3) GeometricMeanCone is bridged to WeightedPowerMeanCone (just sets all alpha_i to 1/n)

Then we may not need https://github.com/JuliaOpt/MathOptInterface.jl/issues/406 as it's just a combination of bridges 2) and 3).

chriscoey commented 4 years ago

This sounds like too many cones some of which aren't necessarily very practically useful in modeling. If we are to add a cone like CBF's power cone with the ||x||_2 on one side, I think it would only really be for the sake of compatibility with CBF, because as I've said I just don't think it's a practically useful cone in modeling. What is most practically useful IMO is the prod t_i^(alpha_i) >= x^(sum(alpha)) cone, which is what you called WeightedPowerMeanCone (I'm OK with that name). The reason I don't restrict to alpha on the simplex, just positive alpha, is if you have integer alphas then you can represent the alpha vector with integers (which are not type T of the model, so we shouldn't type them that way), and you don't get Floating point / round off errors in the powers. No need to restrict to alpha on simplex when alpha positive will do.

So I would propose to start by adding the most practically useful cone, which is the (1+n)-dim WeightedPowerMeanCone prod t_i^(alpha_i) >= x^(sum(alpha)). And depending on how that goes, and whether we care about CBF's power cone definition, later consider adding that.

A question is: why shouldn't we just generalize MOI's current 3-dim power cone for n-dim (edit: I mean just change that existing definition to have a vector of exponents instead of 1 exponent), which gives us the WeightedPowerMeanCone above? MOSEK and SCS only support the 3-dim version, so then the bridge would be from an n-dim to n 3-dim cones of the same type plus an LP inequality. That avoids the need to add any new cone definitions immediately, but it is breaking. edit: But the fact that it is breaking should not deter us - we want to avoid useless set proliferation.

edit: in the last paragraph I overlooked the fact that the 3dim power definition has an absolute value on one side, whereas the hypograph set WeightedPowerMeanCone does not have the absolute value. Sigh. The absolute value isn't really useful from a modeling perspective, but it is what MOSEK and SCS use. So yes we need a new WeightedPowerMeanCone. So is it justified to have the existing GeometricMeanCone, which is a special case of WeightedPowerMeanCone with exponents a vector of ones, separate from the new WeightedPowerMeanCone cone? For the sake of dispatch and bridge efficiency I guess it does make sense for them to be separate.

odow commented 4 years ago

It feels like a few of these cones could be kept in Hypatia for now? Or some "ConicExtensions.jl" package? We're adding cones that only one (unreleased) solver supports.

chriscoey commented 4 years ago

We have had several cones that only one unreleased solver supports for a while now, such as the geometric mean cone, the logdet and rootdet cones. They are useful even if no solver supports them.

I am only proposing to add one useful cone that generalizes the geometric mean here, and if we so decide then another one that is recognized by CBF but not MOI (but as I've made clear, I don't particularly care for this cone, even though it is in Hypatia).

chriscoey commented 4 years ago

I've updated the issue title to narrow the scope

chriscoey commented 4 years ago

And here is a proposed definition: https://github.com/JuliaOpt/MathOptInterface.jl/pull/997/files#diff-0ffd5beb53539c0ff8252e848c93469eR245

An important question is whether to type the cone with T and the exponent vector with Vector{T}. This is what is done for the existing 3D power cone currently, but I don't understand the purpose of that type restriction for the exponent. @blegat why not just use exponent::Real there and define the 3d power cone without a type parameter?

blegat commented 4 years ago

This sounds like too many cones some of which aren't necessarily very practically useful in modeling. If we are to add a cone like CBF's power cone with the ||x||_2 on one side, I think it would only really be for the sake of compatibility with CBF, because as I've said I just don't think it's a practically useful cone in modeling.

I agree, I just wanted to get the big picture to see that Generalized could mean many things.

A question is: why shouldn't we just generalize MOI's current 3-dim power cone for n-dim

How would Mosek, SCS, ECOS say that they only support the 3-dim version ? If you add the length in the type definition and the user creates all power cones of length 1 to N then you end up compiling N times more code. So it's better to have one special case for the 3-dim and one generic case. Mosek, SCS, ECOS does not support the generic case and it needs to be bridges to many 3-dim cones.

blegat commented 4 years ago

@odow I mentioned GeneralizedPowerCone and GeneralizedGeometricMean just for the sake of experiementing with naming. It would make sense to add GeneralizedWeightedPowerMeanCone to MOI as it's part of CBF Adding WeightedPowerMeanCone is then interesting to add as it allows to share bridges between GeneralizedWeightedPowerMeanCone and GeometricMeanCone

chriscoey commented 4 years ago

@blegat re: your comment https://github.com/JuliaOpt/MathOptInterface.jl/pull/997#discussion_r363411539, I asked above:

An important question is whether to type the cone with T and the exponent vector with Vector{T}. This is what is done for the existing 3D power cone currently, but I don't understand the purpose of that type restriction for the exponent. @blegat why not just use exponent::Real there and define the 3d power cone without a type parameter?

blegat commented 4 years ago

With exponent::Real, the exponent is not fully typed so it's going to cause inference issues. For the 3-d power cone, the exponent cannot be integer so in which case T would not work ?

chriscoey commented 4 years ago

OK maybe I misunderstood. So can type of the exponent be different from the type of the real used in the VAF? i.e. is VAF{T1<:Real} in Power{T2<:Real} with T1 != T2 allowed? I assumed they were being forced to be the same, which would seem unnecessary.

blegat commented 4 years ago

It's not forced to be the same, in the @model macro, the sets that are put in the "typed vector sets" tuple will be typed by {T} where T is the coefficient types of the function. However, you can also add Power{Float32} in the "untyped vector sets" tuple if you want, or just rely on the UniversalFallback to support your constraint.

chriscoey commented 4 years ago

OK thanks, that wasn't obvious to me. In that case there is no issue. I'll amend the cone definition to be ...Cone{T} with exponents::Vector{T}

odow commented 2 years ago

Is this still important? What solvers support it?

chriscoey commented 2 years ago

It's a useful discussion, but if Benoit agrees then it can be closed

odow commented 2 years ago

Closing for now as out-of-scope. Let's wait until we have someone ask for the modeling capability and/or a few solvers natively support it.