problem: error in prob_to_socp.py

steve3141 commented 10 years ago

Help will be appreciated!

Simple reproducible example: problem is

parameter mu(2)
variable x(2)
minimize( norm(x) )
mu'*x==1

The basic problem is that mu is converted to a coo_matrix in the following lines, where the column indices are all set to 1 rather than 0:

    params['mu'] = sp.coo_matrix(params['mu'])
    Ai.append((idx for idx in params['mu'].col))
    Aj.append((1 + 1*idx for idx in params['mu'].row)) ### WHY "1 +" ???
    Av.append((v for v in params['mu'].data))

This leads to an out-of-bounds error:

  File "\\ms\dist\python\PROJ\scipy\0.13.2-py27\lib\scipy\sparse\coo.py", line 225, in _check
    raise ValueError('row index exceedes matrix dimensions')

This happens at the following line in prob_to_socp:

    if p > 0: A = sp.csc_matrix((Av, np.vstack((Ai, Aj))), (p,n))

where p = 1 and n = 3

steve3141 commented 10 years ago

Sorry, my analysis was mistaken, there's nothing wrong with the 1+ since this is a sparse matrix representation. I think the problem is that the rows and columns (Ai, Aj) have been reversed. The error is saying that Ai has indices too large for the number of rows specified, which is p=1. Which is true because Ai has been filled with the column indices of the original mu row Vector. I've gone through the example and test problems and there doesn't seem to be any that use an affine constraint, so I believe this is a genuine bug.

steve3141 commented 10 years ago

And the problem can be reproduced even more simply by leaving out the objective function:

parameter mu(2)
variable x(2)
mu'*x==1

steve3141 commented 10 years ago

And the same problem occurs generating a cone constraint (ie affecting the G matrix rather than the A matrix):

parameter mu(2)
variable x(2)
norm(mu'*x) <= 1

echu commented 10 years ago

Thanks for pointing this out! I'll look into it!

On Thursday, July 3, 2014, steve3141 notifications@github.com wrote:

Sorry, my analysis was mistaken, there's nothing wrong with the 1+ since this is a sparse matrix representation. I think the problem is that the rows and columns (Ai, Aj) have been reversed. The error is saying that Ai has indices too large for the number of rows specified, which is p=1. Which is true because Ai has been filled with the column indices of the original mu row Vector. I've gone through the example and test problems and there doesn't seem to be any that use an affine constraint, so I believe this is a genuine bug.

— Reply to this email directly or view it on GitHub https://github.com/cvxgrp/qcml/issues/45#issuecomment-47951915.

steve3141 commented 10 years ago

The same applies to the generated "C" code. The mu vector is being transposed when stuffed into the A matrix:

       for(i = 0; i < params->mu->nnz; ++i) *A_row_ptr++ = params->mu->j[i];
       for(i = 0; i < params->mu->nnz; ++i) *A_col_ptr++ = params->mu->i[i];

Perhaps the transposition (mu -> mu') is somehow being ignored. I haven't been able to understand the gcml codegen code yet well enough to identify the true source of the problem. Help would be appreciated! It looks like a workaround is to supply the 'mu' vector in transposed form. This works also for the Python generated code, eg

mu = numpy.array( [ [1.0], [1.0] ] )

rather than what one would expect -- and what works for Vector parameters in the objective function:

mu = numpy.array(  [1.0, 1.0] )

steve3141 commented 10 years ago

@echu: thanks for looking into this!

echu commented 10 years ago

Okay, so I did some digging. This looks like some numpy/scipy ambiguity and miscommunication on my part. The C code is correct. It's transposed when stuffed into A so that "mu" lies on a single row (note that it has a single column and the row pointer just copies over the column value, which is always "1").

When you type parameter mu(2), QCML expects a column vector of length 2, this corresponds to np.array([[1.0],[1.0]]).

In the generated code, I coerce mu to a scipy sparse matrix; try this experiment:

mu = numpy.array([1.0,1.0])
print mu.shape
print scipy.sparse.coo_matrix(mu).shape
mu = numpy.array([[1.0],[1.0]])
print mu.shape
print scipy.sparse.coo_matrix(mu).shape

QCML expects that the result be a (2,1) matrix. That's the origin of the problem. I'm open to any suggestions for fixing this, since I know it will cause headaches in the future.

steve3141 commented 10 years ago

Thanks.

So my suggested 'workaround' is actually the expected behavior? In other words, mu should be given with shape = (2,1) not (2,)?

It seems a bit inconvenient but the main problem is that it's inconsistent in different usages. There are several cases where mu must be provided as a row vector. Providing a column vector produces an error.

So long as it's consistent and documented it works (but please see my suggestion below for a more user-friendly approach.)

When used as a pure affine objective function, mu must be provided as a row vector:

parameter mu(2)
variable x(2)
minimize mu'*x

But when used in a cone objective it has to be column 'vector':

parameter mu(2)
variable x(2)
minimize norm(mu'*x)

And when used as a constraint, whether affine or cone, it must also be a column vector (as noted earlier).

There are other cases where a row vector must be provided. Consider the Lasso example in the user guide, where a Vector parameter is defined as

b = randn(m)

which is a row vector (ie shape=(m,)) In fact you get an error if you define b as a column vector.

The most user-friendly way to handle this, I think, is that qcml should simply transpose a Vector parameter if necessary. So the user can simply provide a Vector parameter either as an Nx1 or 1xN array.

steve3141 commented 10 years ago

Just to reach a bit further in terms of consistency. It's clear that variables are actually row vectors. How do I know this? That's the way they are returned from the solver. Since expressions can combine vector parameters and variables (using operations like addition) they must be of the same shape. Therefore if we have to choose the shape of vector parameters, the only consistent choice is for them to be rows.

I suggested before that the parameters could be transposed as needed but upon reflection I see that this would not be consistent with aim of QCML to generate minimal code for a given problem. So it seems that a pre-determined shape must be chosen. In which case I think, based on the remarks above, that it should be a row.

echu commented 10 years ago

Thanks for looking into this! Looks like I wasn't as consistent as I hoped to be!

On Thursday, July 3, 2014, steve3141 notifications@github.com wrote:

Just to reach a bit further in terms of consistency. It's clear that variables are actually row vectors. How do I know this? That's the way they are returned from the solver. Since expressions can combine vector parameters and variables (using operations like addition) they must be of the same shape. Therefore if we have to choose the shape of vector parameters, the only consistent choice is for them to be rows.

I suggested before that the parameters could be transposed as needed but upon reflection I see that this would not be consistent with aim of QCML to generate minimal code for a given problem. So it seems that a pre-determined shape must be chosen. In which case I think, based on the remarks above, that it should be a row.

— Reply to this email directly or view it on GitHub https://github.com/cvxgrp/qcml/issues/45#issuecomment-47986047.

steve3141 commented 10 years ago

Another example where a Vector parameter must be given as a row array is in the portfolio example (portfolio.py in the examples directory):

    n = 100   # number of assets
    m = 10    # number of factors
    mu = np.exp(randn(n))              ## A *row* vector
    F = randn(n,m)
    D = spdiags(rand(n),0,n,n)
    gamma = 1

    print "Creating portfolio problem."

    # a QCML model is specified by strings
    #   the parser parses each model line by line and builds an internal
    #   representation of an SOCP
    s = """
        dimensions m n

        variable x(n)
        parameter mu(n)
        parameter gamma positive
        parameter F(n,m)
        parameter D(n,n)
        maximize (mu'*x - gamma*(square(norm(F'*x)) + square(norm(D*x))))
            sum(x) == 1
            x >= 0
    """

steve3141 commented 10 years ago

This makes it very difficult to work with optimization problems in any kind of automated way, since some vector parameters have to be provided as rows and others must be provided as columns. I'm going to have to create some type of mapping from problem types to parameter orientation in order to make this work; hopefully QCML can instead be made consistent in this regard. Thanks again for your help.

echu commented 9 years ago

so i just checked in a change. parameter b(m) can take either a row or a column vector. i didn't thoroughly test it, but i think this is the case. this also potentially avoids some bad interactions with #52.

thanks for finding these issues. let me know if this doesn't fix your problems.

cvxgrp / qcml

problem: error in prob_to_socp.py #45