PythonOptimizers / cysparse

Python/Cython library to replace PySparse
http://PythonOptimizers.github.io/cysparse
7 stars 3 forks source link

Row**s** or column**s** scaling #236

Open syarra opened 8 years ago

syarra commented 8 years ago

PySparse has the ability to scale multiple rows/columns via the *= operand

from pysparse.sparse import PysparseMatrix as psp

J = psp(nrow=3, ncol=3)
J[0,0] = 1
J[1,1] = 2
J[2,2] = 3

# Scale two first rows:"
J[:2,:] *= -1.0

It is not possible to do that with CySparse. The same operation fails with error message:

TypeError: unsupported operand type(s) for *=: 'cysparse.sparse.ll_mat_views.ll_mat_view_INT64_t_FLOAT64_t.LLSparseMatrixView_INT64_t_FLOAT64_t' and 'float'

Is there a way to efficiently do the same operation? (without having to call J.row_scale multiple times)

@counterclocker

ghost commented 8 years ago

This is possible in PySparse because of the way "views" are implemented... I'll see what I can do to implement this efficiently in CySparse. As is, CySparse does not support this and calling row_scale multiple times is not efficient.

ghost commented 8 years ago

LLSparseMatrixViews in CySparse are more general and as such bear more complexity.

ghost commented 8 years ago

There is no way to do this in an easy way.

One problem is that you can basically use any combinations of rows and columns (including duplicate ones)). Does in place operations make sense when you duplicate a row? Probably not. And if it does, it is not easy to implement in place operations efficiently.

For the moment, only in place assignment is implemented and not very efficiently. And it is only valid for same sized matrix/view.

It is really worth exploring how to do in place operations?

dpo commented 8 years ago

Scaling the rows or columns of a matrix is quite important. Technically, it can be done by pre- or post-multiplying A with a diagonal matrix. Can we do

A *= D

efficiently, where D is sparse and diagonal?

syarra commented 8 years ago

Yes, it could be done:

D = BandLLSparseMatrix(size=3, diag_coeff=[0], numpy_arrays=[np.array([-1, -1, 1], dtype=np.float64)])

H *= D

It returns a proxy, so no inplace scaling I don't know how efficient this is, though!

ghost commented 8 years ago

H *= D is not allowed (not to my knowledge at least ;-) ).

H = H * D is allowed and but is not efficient for simple scaling.

It is clear for me that the matrix proxy classes should be completed with some in place operations (and maybe some other basic operations (multiplication? addition?)). I don't know when I can do this. My pipeline of tasks is already full for the next couple of months. What are the priorities?

syarra commented 8 years ago

H *= D is not allowed but H = H*D is good enough for now!

I only want to use this kind of operation for multiple row scaling at the same time.

ghost commented 8 years ago

Is it possible to give me all operations that need to be implemented for LLSparseMatrixViews?

This class needs a complete rewriting because I didn't understand exactly what was needed or not. For the moment, it serves as a view to retrieve elements or to assign elements but only one by one which is of course not very clever. Note that doing batch assignments is very difficult to do efficiently.

Here is my current list:

Do we need to use a view more like a matrix, for instance allowing:

Something else?

What do you think?

@syarra @dpo

ghost commented 8 years ago

By the way, we started this discussion with LLSparseMatrixViews but then switched to LLSparseMatrixes for which col_scale() and row_scale() are efficiently implemented. This is done one column/row at a time. I don't dare to ask if this is enough because doing scaling in batches efficiently would require some work.