davidsd / sdpb

A semidefinite program solver for the conformal bootstrap.
MIT License

Using a varying number of blocks per constraint #47

Open nanleij opened 3 years ago

nanleij commented 3 years ago

Hi, I'm working on extending the (univariate) PMP to the multivariate setting, hopefully with SDPB as the solver. A lot stays the same; the most important change is that you need more blocks per constraint, and the number of blocks can vary between constraints. I.e., in the univariate case you have (for positive definiteness on the set x >= 0): M(x) = S_1(x)^T S_1(x) + x S_2(x)^T S_2(x), whereas in the multivariate case you need (for positive definiteness on a set S_G = {x in R^n : g_i(x) >= 0, i = 1, ..., L}): M(x) = \sum_{i=0}^L g_i(x) S_i(x)^T S_i(x), where the g_i(x) are polynomials (with g_0(x) = 1). So you really need the option to add a different number of Y blocks per constraint (blocks_j contains more than 2 elements).

However, it seems like this is not supported at the moment. In the input format (the files pvm2sdp generates from xml), blocks.i and bilinear_basis.i assume 2 blocks per constraint, and the same assumption is made when reading those files (the parity logic in sdpb/src/sdp_solve/SDP/SDP/read_bilinear_basis.cxx seems to track whether a block is the first or the second block of a constraint). In most computations, the information about the blocks appears to come from block_info; if that's the case, adding support for an arbitrary number of blocks might not be too difficult.

This raises a few questions:

  1. To what extent is the assumption used in the solver itself?
  2. Can it be implemented in the (near) future? A problem might be the input format. Not only would the reading have to change, but also the writing of the input by pvm2sdp and so on, as well as the wrappers. An idea would be to add an extra option for a slightly different input format, allowing for multiple blocks, while keeping 2 blocks per constraint as the default. Or an extra input file specifying the number of blocks per constraint, with 2 blocks per constraint as the default when that file is not found.

If it is low on your priority list and the assumption is not very widespread, I might give it a try myself. In that case, any advice would be appreciated (also because I only know the basics of programming in C++)!

EDIT: added the ^n for the multivariate setting

davidsd commented 3 years ago

Hi Nando,

Do I understand correctly that in your extended version of the positivity constraint, x is now a vector taking values in R^n? So would the bilinear basis now be given by multivariate polynomials too?

Another question: in the multivariate case, isn't there a difference between sum of squares and positivity? A semidefinite program solver can at most investigate whether something is a sum of squares, but this may result in sub-optimal bounds. Is this good enough for your desired application?

David

nanleij commented 3 years ago

Hi, you're right, x is indeed taking values in R^n, and the bilinear basis is given by multivariate polynomials. I see now that I forgot the ^n.

As a bit of context: for polynomials (so not polynomial matrices), Putinar's theorem states the following (with x in R^n, and g_0(x) = 1):

Let p(x) > 0 on a set S_G = {x in R^n : g_i(x) >= 0 for all i = 1, ..., L}, with M_G = { \sum_{i=0}^L g_i(x) s_i(x) : s_i(x) is a sum of squares } being Archimedean (there is an N such that N - ||x||^2 is in M_G, basically a certificate that S_G is bounded). Then p(x) is in M_G.

Conversely, p(x) in M_G implies that p(x) >= 0 on S_G.

So a constraint p(x) > 0 on S_G converts to p(x) = \sum_{i=0}^L g_i(x) s_i(x) with the s_i being sums of squares of polynomials. Since p(x) + epsilon > 0 for every epsilon > 0 whenever p(x) >= 0, this can be used to approximate the constraint p(x) >= 0. For the actual optimization, you also need to restrict the maximum degree of the s_i; you still get convergence to the original problem (with p(x) >= 0) as you increase the degree. Because any feasible solution gives an upper resp. lower bound on the optimal value (for minimization resp. maximization), this is fine in certain applications such as bounding the maximum sphere packing density or the maximal size of a spherical code. I am looking at these kinds of applications, for which an upper bound is good enough.
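To make the sum-of-squares-to-SDP step concrete, here is a minimal sketch, not related to the SDPB code: the polynomial, the monomial basis, and the use of cvxpy are my own illustrative assumptions. It lower-bounds a univariate polynomial by maximizing gamma such that p(x) - gamma is a sum of squares, written with a Gram matrix over the basis (1, x, x^2):

```python
import cvxpy as cp

# Hypothetical toy example (not SDPB): lower-bound p(x) = x^4 - 3 x^2 + 1
# by maximizing gamma such that p(x) - gamma is a sum of squares, i.e.
# p(x) - gamma = m(x)^T G m(x) with m(x) = (1, x, x^2) and G PSD.
G = cp.Variable((3, 3), PSD=True)
gamma = cp.Variable()

constraints = [
    G[0, 0] == 1 - gamma,         # constant coefficient
    2 * G[0, 1] == 0,             # coefficient of x
    2 * G[0, 2] + G[1, 1] == -3,  # coefficient of x^2
    2 * G[1, 2] == 0,             # coefficient of x^3
    G[2, 2] == 1,                 # coefficient of x^4
]

cp.Problem(cp.Maximize(gamma), constraints).solve()
print("SOS lower bound on p:", gamma.value)  # any feasible gamma is a valid bound
```

For this particular p the optimum is gamma = -5/4, and any feasible gamma already gives a valid (if weaker) bound, which is the point about bounds made above.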

For polynomial matrices, you can prove a very similar theorem (as done by Klep and Schweighofer in arXiv:0907.2260, Theorem 13), connecting positive definiteness of a multivariate polynomial matrix M(x) on a set S_G to it being of the form \sum_i g_i(x) S_i(x)^T S_i(x), which can be modeled as stated in my first post.

Nando

wlandry commented 3 years ago

whereas in the multivariate case you need (for positive definiteness on a set S_G = {x in R^n : g_i(x) >= 0, i = 1, ..., L}): M(x) = \sum_{i=0}^L g_i(x) S_i(x)^T S_i(x), where the g_i(x) are polynomials.

I am a bit confused here. Just looking at the case of two variables, x_0 and x_1, I would expect cross terms? Something like M = Tr(Y_1(Q⊗I)) + x_0 Tr(Y_2(Q⊗I)) + x_1 Tr(Y_3(Q⊗I)) + x_0 x_1 Tr(Y_4(Q⊗I)). Or are you working with a subset of problems that do not have cross terms?

So you really need the option to add a different number of Y blocks per constraint (blocks_j contains more than 2 elements).

However, it seems like this is not supported at the moment. In the input format (the files pvm2sdp generates from xml), blocks.i and bilinear_basis.i assume 2 blocks per constraint, and the same assumption is made when reading those files (the parity logic in sdpb/src/sdp_solve/SDP/SDP/read_bilinear_basis.cxx seems to track whether a block is the first or the second block of a constraint). In most computations, the information about the blocks appears to come from block_info; if that's the case, adding support for an arbitrary number of blocks might not be too difficult.

This raises a few questions:

1. To what extent is the assumption used in the solver itself?

It is pretty ubiquitous, sometimes in subtle ways. For example, some arrays are twice as big as others. Generalizing it would not be trivial.

2. Can it be implemented in the (near) future? A problem might be the input format. Not only would the reading have to change, but also the writing of the input by `pvm2sdp` and so on, as well as the wrappers. An idea would be to add an extra option for a slightly different input format, allowing for multiple blocks, while keeping 2 blocks per constraint as the default. Or an extra input file specifying the number of blocks per constraint, with 2 blocks per constraint as the default when that file is not found.

If it is low on your priority list and the assumption is not very widespread, I might give it a try myself. In that case, any advice would be appreciated (also because I only know the basics of programming in C++)!

It is not on my priority list, and this is the first time I have heard of someone needing this. I have tried to make the code understandable, but if you are not reasonably familiar with C++, you are going to have a difficult time. If you can accomplish what you need by generating alternate input files, that would be more feasible. You can write those files using whatever tools you are most familiar with; the format is pretty simple.

One workaround would be to use a lot of constant constraints. You could create a grid that covers the multivariate space. After you get a solution, you can verify that it is positive everywhere. I am working on automating that kind of procedure, but it is not ready yet.
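A rough sketch of that kind of grid-plus-verification step (my own illustration; the polynomial `p` and the region [-1, 1]^2 are hypothetical placeholders, not SDPB output):

```python
import numpy as np

# Hypothetical illustration of the grid workaround described above.
# `p` stands in for the polynomial recovered from the solver's solution;
# here it is just a placeholder so the script runs on its own.
def p(x0, x1):
    return 1.0 + x0**2 + x1**2 - x0 * x1

# Sample points covering the region of interest (here [-1, 1]^2).
grid = np.linspace(-1.0, 1.0, 201)
values = np.array([[p(x0, x1) for x1 in grid] for x0 in grid])

if (values > 0.0).all():
    print("solution is positive at every grid point")
else:
    print("minimum value on the grid:", values.min())
```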

nanleij commented 3 years ago

The strength of Putinar's theorem is that it removes the need for cross terms, at the cost of the extra condition on M_G = { \sum_{i=0}^L g_i(x) s_i(x) : s_i(x) is a sum of squares }, namely that the polynomial N - ||x||^2 is in M_G for some N. There is also Schmüdgen's Positivstellensatz, which states that if a multivariate polynomial p(x) > 0 on a set S_G = {x in R^n : g_i(x) >= 0} and S_G is compact (closed and bounded in this case), then p(x) = \sum_a (\prod_i g_i(x)^{a_i}) s_a(x) with the s_a sums of squares and a ranging over {0,1}^L, i.e. you do get the cross terms. Putinar's theorem is slightly less general because you need a condition on M_G, i.e. on the polynomials defining S_G.

As an example, consider n = 2 with the set S_G = {(x,y) : 1 - x^2 >= 0, 1 - y^2 >= 0} = [-1,1]^2, the unit square, which is compact, and for which M_G is Archimedean. Schmüdgen's theorem says that a strictly positive polynomial on S_G is of the form p(x,y) = s_0(x,y) + (1-x^2) s_1(x,y) + (1-y^2) s_2(x,y) + (1-x^2)(1-y^2) s_3(x,y), with each s_i(x,y) a sum of squares of polynomials. Putinar's theorem says that, because 2 - ||(x,y)||^2 = 2 - x^2 - y^2 is in M_G, you don't need the cross term. You still need three blocks though, because there are three sums of squares.
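For what it's worth, here is a small self-contained sketch of that three-block structure (my own toy example, not the SDPB formulation: the polynomial p, the monomial basis, and the use of sympy/cvxpy are assumptions made purely for illustration). It searches for Gram matrices G_0, G_1, G_2 with p = m^T G_0 m + (1-x^2) m^T G_1 m + (1-y^2) m^T G_2 m, one positive semidefinite block per sum of squares:

```python
import itertools
import sympy as sp
import cvxpy as cp

# Hypothetical toy example (not SDPB code): certify that
# p(x, y) = 3 + x*y - x^2*y^2 is positive on the unit square by finding
# p = m^T G_0 m + (1 - x^2) m^T G_1 m + (1 - y^2) m^T G_2 m,
# with m = (1, x, y, x*y) and each G_i positive semidefinite.
x, y = sp.symbols("x y")
p = 3 + x * y - x**2 * y**2
basis = [sp.Integer(1), x, y, x * y]
mults = [sp.Integer(1), 1 - x**2, 1 - y**2]  # g_0, g_1, g_2

grams = [cp.Variable((len(basis), len(basis)), PSD=True) for _ in mults]

# Match coefficients monomial by monomial: collect, for every monomial,
# the Gram-matrix entries that contribute to it.
contributions = {}
for g, G in zip(mults, grams):
    for a, b in itertools.product(range(len(basis)), repeat=2):
        poly = sp.Poly(g * basis[a] * basis[b], x, y)
        for mono, coeff in poly.terms():
            contributions.setdefault(mono, []).append((float(coeff), G[a, b]))

p_coeffs = {mono: float(c) for mono, c in sp.Poly(p, x, y).terms()}
constraints = [
    sum(c * entry for c, entry in terms) == p_coeffs.pop(mono, 0.0)
    for mono, terms in contributions.items()
]
assert not p_coeffs  # every monomial of p appeared on the left-hand side

problem = cp.Problem(cp.Minimize(0), constraints)
problem.solve()
print(problem.status)  # 'optimal' means a three-block decomposition exists
```

The three Gram matrices here play the role of the three sums of squares in the example, which is the kind of per-constraint block count being discussed in this issue.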

Thanks for the answer, I will have a better look at possible workarounds. The higher number of blocks is only needed for multivariate polynomial (matrix) optimization, so if that is not used in the community using SDPB, it is not surprising that you haven't heard of it. In any case, a normal SDP solver will also work, although without the speedup from the rank-1 structure and the block structure of the Schur complement matrix.