cvxgrp / scs

Splitting Conic Solver
MIT License
545 stars 134 forks source link

Symmetricity of SDP #31

Closed karanveerm closed 9 years ago

karanveerm commented 9 years ago

Unless we(SCS.jl) are doing something wrong, it seems possible that SCS can return semi-definite matrices that are not symmetric.

using Convex, SCS
set_default_solver(SCSSolver())

y = Variable((2, 2), :Semidefinite)
p = minimize(y[1, 1], y[1, 2] == 1)
solve!(p)
y.value
   0           1.0    
 -1.0         0.58682

Is there a reason why it is possible for SDP matrices to not be symmetric? The README seems to say positive semidefinite cone { X | min(eig(X)) >= 0, X = X^T } in which case the matrix should be symmetric.

Here is the data in raw format:

A = [
 0.0 0.0 1.0 0.0;
 -1.0 0.0 0.0 0.0;
 0.0 -1.0 0.0 0.0;
 0.0 0.0 -1.0 0.0;
 0.0 0.0 0.0 -1.0]
c = [1.0,0.0,0.0,0.0]
b = [1.0,0.0,0.0,0.0,0.0]
f = 1
s = [2]
ssize = 1
mlubin commented 9 years ago

Do the constraints need to be symmetric? e.g. 0.5*y[1, 2] + 0.5*y[2, 1] == 1 instead of y[1, 2] == 1

SteveDiamond commented 9 years ago

I've always added a symmetry constraint, and SCS has worked fine. But I was looking through the SCS code, and something struck me as a little odd.

The code for projecting a matrix X onto the PSD cone actually projects (X + X')/2 onto the PSD cone. So it's not assuming that X is symmetric. But if X is not constrained to be symmetric, then the PSD cone is not self-dual. On R^{nxn}, the dual of symmetric PSD matrices is matrices Y such that Y+Y' is PSD. So I'm not sure what's going on.

Maybe it would be better if PSD cones had dimension n*(n+1)/2, i.e. only the lower triangular entries.

@bodono what do you think? I probably have misunderstood something.

SteveDiamond commented 9 years ago

What CVXOPT does is assume that in

minimize        c'*x 
subject to      A*x + s = b 
                s in K

the rows of A corresponding to the upper triangular parts of PSD blocks in s are assumed to be the same as the rows corresponding to the lower triangular parts.

bodono commented 9 years ago

@SteveDiamond is right, basically the cone of symmetric positive semi-definite cones is not self-dual in R^{n x n}, so the Moreau decomposition of any point into a member of the PSD cone and the dual of the PSD cone will end up with a non-symmetric part. SCS projects the dual variables onto the cone, so it ends up with a primal variable that might not be symmetric. I noticed this very early on when I was implementing the PSD cone. The right way to fix it is to make the PSD cone explicitly a subset of R^{n x (n+1)/2}. The reason I don't do this is because I wanted to use the same input format as sedumi, which unfortunately uses n x n. There are two ways we could think about fixing this:

1) Make scs.jl (and others) do as cvxpy does, and make the data matrix "symmetric" with respect to s. 2) Allow two ways to specify the PSD constraints in SCS, one n x n, and one n x (n+1)/2. cvxpy, scs.jl, cvx, and others can use the second way for all PSD cones, but we're still compatible with Sedumi input formats. This would likely take the form of a new field in the cone struct.

Thoughts?

mlubin commented 9 years ago

FWIW, Mosek, the only commercial SDP solver, enforces symmetry by only accepting the lower triangle of matrix coefficients: http://docs.mosek.com/7.1/pythonapi/Semidefinite_optimization.html. Input format and algorithmic implementation are two distinct issues. My vote would be for doing it the "right" way algorithmically so that it's impossible to create a nonsymmetric solution. You can always transform the user's input if they want to provide the problem in n x n format.

madeleineudell commented 9 years ago

first, a mathematical aside: @bodono, I was somewhat surprised/confused at first by your assertion that "the cone of symmetric positive semi-definite cones is not self-dual in R^{n x n}", so I'd like to give a very concrete example to help posterity understand what's going on. I'm going to call symmetric positive semi-definite SPSD in what follows, for clarity. The SPSD cone is self-dual in S^n, but not in R^{n x n}. Self-duality in S^n means that "Among all symmetric X and Y, Tr(X,Y) >= 0 for any Y PSD <=> X is PSD". Non-self-duality in R^{n x n} means "There is some asymmetric X for which Tr(X,Y) >= 0 for any Y SPSD." In fact, @stevediamond mentioned above that any X for which (X+X')/2 is PSD will work. Let's see that explicitly: how about X = [10 1; -1, 10]. Then (writing <X,Y> for the trace inner product) <X,Y> = <[10 0; 0, 10], Y> + <[0 1; -1, 0], Y> = 10 Tr(Y) + 0 > 0 for any Y SPSD.

I agree with Miles that it's important to make it impossible to create a problem for which SCS returns an asymmetric solution. Is there a good way to detect asymmetric problems and to symmetrize them inside SCS? The cvxpy solution right now is to add constraints to enforce symmetry of PSD variables, which is different from modifying the matrix A so that the algorithm will automatically enforce symmetry without adding extra constraints. But either of these, I think, should be done inside SCS (perhaps with an optional flag to turn off the behavior) rather than by its calling libraries.

On Fri, Jan 2, 2015 at 7:50 AM, Miles Lubin notifications@github.com wrote:

FWIW, Mosek, the only commercial SDP solver, enforces symmetry by only accepting the lower triangle of matrix coefficients: http://docs.mosek.com/7.1/pythonapi/Semidefinite_optimization.html. Input format and algorithmic implementation are two distinct issues. My vote would be for doing it the "right" way algorithmically so that it's impossible to create a nonsymmetric solution. You can always transform the user's input if they want to provide the problem in n x n format.

— Reply to this email directly or view it on GitHub https://github.com/cvxgrp/scs/issues/31#issuecomment-68536086.

Madeleine Udell PhD Candidate in Computational and Mathematical Engineering Stanford University www.stanford.edu/~udell

madeleineudell commented 9 years ago

Here's a more concrete proposal. Suppose we have the (standard conic form) problem

minimize c'*x subject to Ax+b in K

and we know that x[i]==x[j] at any solution. Well then we can symmetrize A and c with respect to the indices i and j without changing the set of solutions. We just make the ith column of the A and the jth column of A both equal to the average of the ith and jth column, and similarly for c. Explicitly, let

A[:,i] <- (A[:,i] + A[:,j])/2 A[:,j] <- (A[:,i] + A[:,j])/2 c[i] <- (c[i]+c[j])/2 c[j] <- (c[i]+c[j])/2

and the set of solutions to the problem is preserved. But now there is nothing breaking the symmetry between i and j in the problem. As far as I understand the algorithm, that will imply that x[i]=x[j] at the solution returned by SCS so long as the algorithm is initialized symmetrically. (Others with more intuition, please correct me if that's wrong. I'm also not sure if the algorithm would be stable wrt perturbations away from symmetry.)

This kind of preprocessing could be imposed by any modeling layer that knows that part of a variable is supposed to be in the SDP cone, either in Convex.jl or in MathProgBase by exploiting our new very expressive conic form.

But this columnwise procedure is a little unnatural, because the problem statement really imposes symmetry on slack variables (corresponding to rows of A), rather than variables (columns). I suppose you could do the same thing for the rows of A, but I'm having trouble thinking of a row-wise scheme that guarantees a symmetric solution x. (Because even if the rows corresponding to SDP constraints on x are symmetric, we might have an asymmetry between entries of x in the objective.)

On Fri, Jan 2, 2015 at 12:52 PM, Madeleine Udell madeleine.udell@gmail.com wrote:

first, a mathematical aside: @bodono, I was somewhat surprised/confused at first by your assertion that "the cone of symmetric positive semi-definite cones is not self-dual in R^{n x n}", so I'd like to give a very concrete example to help posterity understand what's going on. I'm going to call symmetric positive semi-definite SPSD in what follows, for clarity. The SPSD cone is self-dual in S^n, but not in R^{n x n}. Self-duality in S^n means that "Among all symmetric X and Y, Tr(X,Y) >= 0 for any Y PSD <=> X is PSD". Non-self-duality in R^{n x n} means "There is some asymmetric X for which Tr(X,Y) >= 0 for any Y SPSD." In fact, @stevediamond mentioned above that any X for which (X+X')/2 is PSD will work. Let's see that explicitly: how about X = [10 1; -1, 10]. Then (writing <X,Y> for the trace inner product) <X,Y> = <[10 0; 0, 10], Y> + <[0 1; -1, 0], Y> = 10 Tr(Y) + 0 > 0 for any Y SPSD.

I agree with Miles that it's important to make it impossible to create a problem for which SCS returns an asymmetric solution. Is there a good way to detect asymmetric problems and to symmetrize them inside SCS? The cvxpy solution right now is to add constraints to enforce symmetry of PSD variables, which is different from modifying the matrix A so that the algorithm will automatically enforce symmetry without adding extra constraints. But either of these, I think, should be done inside SCS (perhaps with an optional flag to turn off the behavior) rather than by its calling libraries.

On Fri, Jan 2, 2015 at 7:50 AM, Miles Lubin notifications@github.com wrote:

FWIW, Mosek, the only commercial SDP solver, enforces symmetry by only accepting the lower triangle of matrix coefficients: http://docs.mosek.com/7.1/pythonapi/Semidefinite_optimization.html. Input format and algorithmic implementation are two distinct issues. My vote would be for doing it the "right" way algorithmically so that it's impossible to create a nonsymmetric solution. You can always transform the user's input if they want to provide the problem in n x n format.

— Reply to this email directly or view it on GitHub https://github.com/cvxgrp/scs/issues/31#issuecomment-68536086.

Madeleine Udell PhD Candidate in Computational and Mathematical Engineering Stanford University www.stanford.edu/~udell

Madeleine Udell PhD Candidate in Computational and Mathematical Engineering Stanford University www.stanford.edu/~udell

bodono commented 9 years ago

I don't want to have any sort of data cleaning or preparation inside of SCS (besides the normalization, which can actually be done 'matrix-free' anyway). The reason being that the code treats the cones and the linear systems entirely separately, so that you can add any new cones you like, or swap in a new linear system solver very easily. In fact you can run SCS without having access to the data directly, just an oracle that supplies Ax and A'x for input x. So the options are to do nothing and just assume that the input data matrix A respects the symmetry (or enforces a symmetry constraint as cvxpy does), or change the input format to only supply the lower triangular part of the primal variable. I prefer the second option, but it is a breaking API change so is pretty drastic.

mlubin commented 9 years ago

@bodono, data cleaning and preprocessing user input is quite a well accepted practice. User models typically have many variables that can easily be eliminated, constraints that are redundant, or coefficients that can be improved by logical implication. These tricks are especially useful for relaxations of integer programming-type problems, but still for LPs a speedup between 40% and 400% would not be surprising see here and here. I'd argue that these techniques are underutilized by conic solvers. I understand the desire to keep messiness like this out of SCS. The question really is where it should go. We'll be able to manage the symmetrization preprocessing somewhere in the Julia stack, but this means that it won't be available for other users like cvxpy. Would you guess that there's a measurable performance impact by adding n^2/2 extra (very sparse) equality constraints for each PSD cone?

madeleineudell commented 9 years ago

@bodono, what problem exactly is SCS solving? looking at KV's example from the top of this thread,

minimize Y[1,1] subject to -Y + S = 0 S >= 0,

SCS gives the answer Y = [0 1; -1 .586]. it's clear that there is no symmetric S such that -Y + S = 0, so it seems like SCS isn't solving the problem the documentation claims to solve, ie that S is in the SDP cone with SDP cone defined as { X | min(eig(X)) >= 0, X = X^T }.

Is it solving the problem with the asymmetric definition of SDP cone, ie SDP cone defined as { X | min(eig(X+X')) >= 0 }?

bodono commented 9 years ago

For what we're calling the primal problem it essentially uses the asymmetric version of the PSD cone, for the dual problem it's the symmetric definition. This doesn't really make sense from a user standpoint, so it should definitely change.

madeleineudell commented 9 years ago

It might be fine, so long as the documentation makes it clear that that's what it's doing.

Actually, that would allow you to solve problems involving only the asymmetric version of the PSD cone, or only the symmetric version, by passing either the primal or dual of the user's problem to SCS as it is implemented now. A user wishing to solve

minimize c'*x subject to b-Ax in K

where K might include some symmetric PSD cones (and no asymmetric ones) just needs to pass to SCS the problem

minimize -b'_y subject to A'_y + c = 0 y in K^*

(so now K^* has some asymmetric PSD cones in it, and no symmetric ones) and accept the dual solution to this problem as the primal solution to the original problem.

Have I understood this correctly?

On Tue, Jan 6, 2015 at 6:59 AM, bodono notifications@github.com wrote:

For what we're calling the primal problem it essentially uses the asymmetric version of the PSD cone, for the dual problem it's the symmetric definition. This doesn't really make sense from a user standpoint, so it should definitely change.

— Reply to this email directly or view it on GitHub https://github.com/cvxgrp/scs/issues/31#issuecomment-68875388.

Madeleine Udell PhD Candidate in Computational and Mathematical Engineering Stanford University www.stanford.edu/~udell

bodono commented 9 years ago

Technically the PSD cone is R^{n x n} is a different cone to Sn, and I don't think it's really useful to use the R^{n x n} one. How difficult would it be for Convex.jl or cvxpy to use a vector of length n x (n+1)/2 instead of n^2 for PSD cone variables?

SteveDiamond commented 9 years ago

It would be very easy for cvxpy to use a vector of length n x (n+1)/2 instead of n^2 for PSD cone variables.

On Wed, Jan 7, 2015 at 3:35 AM, bodono notifications@github.com wrote:

Technically the PSD cone is R^{n x n} is a different cone to Sn, and I don't think it's really useful to use the R^{n x n} one. How difficult would it be for Convex.jl or cvxpy to use a vector of length n x (n+1)/2 instead of n^2 for PSD cone variables?

— Reply to this email directly or view it on GitHub https://github.com/cvxgrp/scs/issues/31#issuecomment-69010675.

madeleineudell commented 9 years ago

Yes, I think that would work fine for us too.

Just to clarify: as far as I understand, this is only a change in the size of the returned solution, not of the call. c, A, b and cones stay the same, yes? In that case, I'm not sure why SCS couldn't just perform the symmetrization (from a vector of length n(n+1)/2 to one of length n^2) internally, to prevent a breaking API change.

On Wed, Jan 7, 2015 at 3:28 PM, Steven Diamond notifications@github.com wrote:

It would be very easy for cvxpy to use a vector of length n x (n+1)/2 instead of n^2 for PSD cone variables.

On Wed, Jan 7, 2015 at 3:35 AM, bodono notifications@github.com wrote:

Technically the PSD cone is R^{n x n} is a different cone to Sn, and I don't think it's really useful to use the R^{n x n} one. How difficult would it be for Convex.jl or cvxpy to use a vector of length n x (n+1)/2 instead of n^2 for PSD cone variables?

— Reply to this email directly or view it on GitHub https://github.com/cvxgrp/scs/issues/31#issuecomment-69010675.

— Reply to this email directly or view it on GitHub https://github.com/cvxgrp/scs/issues/31#issuecomment-69111607.

Madeleine Udell PhD Candidate in Computational and Mathematical Engineering Stanford University www.stanford.edu/~udell

bodono commented 9 years ago

A and b would have to change since the number of rows in A is equal to the sum of the sizes of all the cones, and we're making the PSD cones go down to size n(n+1)/2 instead of n^2. Any constraints on the upper triangular part of the semidefinite variable would have to be converted to constraints on the lower triangular part, so the structure would change too. Also cvxpy (and possibly cvx and Convex.jl?) would be able to drop the explicit S = S^T set of equality constraints for PSD variables, making the problems smaller again. The semidefinite variables solutions returned by SCS would just be the lower triangular part (stacked columnwise), so to get back the full matrix cvxpy, Convex.jl, cvx etc would have to fill out the upper triangular part.

madeleineudell commented 9 years ago

this raises an interesting point. right now we'd allow users to say

minimize y subject to [x y; z w] in SemidefiniteCone

(where x, y, z, and w are variables). if we remove the upper triangular part of A and b prior to giving the problem to SCS, then the problem is unbounded.

the easiest way to deal with this is still to add extra equality constraints (here, y==z), making the total number of rows of A and b to represent semidefinite variables n^2, where n(n+1)/2 of them come from the reduced size SDP constraint, and n(n-1)/2 come from the new equality constraints. this is super easy to implement: take the rows l and u corresponding to the lower and upper triangular versions of a transposed element in the SDP constraint. replace u by u-l, and set the cone for that row to the zero cone. it's also kind of nice that this means the sizes of everything are preserved.

how does this extra constraint affect the dual variables? how would we recover the dual to the constraint [x y; z w] in SemidefiniteCone interpreted as a symmetric SDP constraint, rather than (as implemented) as a lower triangular SDP constraint + equality constraint?

we could also do some kind of variable/constraint preprocessing to reduce the number of introduced constraints by logical implications. i'm not sure that would be worth the extra complexity.

---------- Forwarded message ---------- From: bodono notifications@github.com Date: Thu, Jan 8, 2015 at 2:11 AM Subject: Re: [scs] Symmetricity of SDP (#31) To: cvxgrp/scs scs@noreply.github.com Cc: Madeleine Udell madeleine.udell@gmail.com

A and b would have to change since the number of rows in A is equal to the sum of the sizes of all the cones, and we're making the PSD cones go down to size n(n+1)/2 instead of n^2. Any constraints on the upper triangular part of the semidefinite variable would have to be converted to constraints on the lower triangular part, so the structure would change too. Also cvxpy (and possibly cvx and Convex.jl?) would be able to drop the explicit S = S^T set of equality constraints for PSD variables, making the problems smaller again. The semidefinite variables solutions returned by SCS would just be the lower triangular part (stacked columnwise), so to get back the full matrix cvxpy, Convex.jl, cvx etc would have to fill out the upper triangular part.

— Reply to this email directly or view it on GitHub https://github.com/cvxgrp/scs/issues/31#issuecomment-69159787.

Madeleine Udell PhD Candidate in Computational and Mathematical Engineering Stanford University www.stanford.edu/~udell

bodono commented 9 years ago

That problem is unbounded in either case, right? If you add z>=0 to the problem your point stands.

In that case you would have to, as you say, introduce the equality constraint z==y, and pass three variables (x,z,w) into the S_2 cone constraint. To get the dual variable to the original problem I think you would just do an affine combination of the dual variables from the equality constraint and the symmetric semidefinite dual variable. Interestingly this would end up with a non-symmetric dual variable if I'm doing it right. This seems like the correct approach from the parser, because the user will (generally) expect that [x y; z w] in SemidefiniteCone will enforce the constraint that z==y, and you guys will take care of that in the background.

madeleineudell commented 9 years ago

Yes, it looks like the equality constraint adds an skew symmetric part to the dual variable: the new lagrangian has terms mu(y-z) + <lambda, [x y; z w]> = <lambda + [0 mu; -mu 0], [x y; z w]>, so you'd probably want to return lambda + [0 mu; -mu 0] as the optimal dual variable.

That's very strange. I'm worried we'd be violating our contract with users by handing them an asymmetric optimal dual variable for a semidefinite constraint. But in fact, it is the SDP constraint that is doing the work to force y==z, and that forcing is asymmetric. So maybe it's fine.

On Fri, Jan 9, 2015 at 8:07 AM, bodono notifications@github.com wrote:

That problem is unbounded in either case, right? If you add z>=0 to the problem your point stands.

In that case you would have to, as you say, introduce the equality constraint z==y, and pass three variables (x,z,w) into the S_2 cone constraint. To get the dual variable to the original problem I think you would just do an affine combination of the dual variables from the equality constraint and the symmetric semidefinite dual variable. Interestingly this would end up with a non-symmetric dual variable if I'm doing it right. This seems like the correct approach from the parser, because the user will (generally) expect that [x y; z w] in SemidefiniteCone will enforce the constraint that z==y, and you guys will take care of that in the background.

— Reply to this email directly or view it on GitHub https://github.com/cvxgrp/scs/issues/31#issuecomment-69354942.

Madeleine Udell PhD Candidate in Computational and Mathematical Engineering Stanford University www.stanford.edu/~udell

karanveerm commented 9 years ago

FYI, both sedumi and sdpt3 also return skew symmetric duals.

bodono commented 9 years ago

I've pushed a new branch, symmetric_sdp, that has the interface change to specify semidefinite variables of size n * (n+1) / 2 (and no longer supporting n^2 sizes). It appears to work... The matlab and python interfaces are also updated (actually very little needed to change there). If you want to test it out with your parsers @SteveDiamond @madeleineudell @karanveerm and give feedback on how it is, that would be very useful. If it all looks good I would like to put this into the v1.1.0 release of SCS, which I hope to push to master sometime in mid February.