sympy / sympy

A computer algebra system written in pure Python
https://sympy.org/
Other
12.81k stars 4.39k forks source link

allow for rhs matrix to be passed to rref #25661

Open smichr opened 1 year ago

smichr commented 1 year ago

https://stackoverflow.com/questions/77061999/reducing-augmented-matrices-in-sympy/77065104#77065104 identifies the use case where someone would like to see the modified rhs obtained while reducing an arbitrary matrix. If an augmented matrix has the rref method run, it knows nothing about the semantics of the input and tries to make the top left square of the matrix an identity matrix and zeros out all values below that (with appropriate operations):

>>> Ab = Matrix([
[ 1,  0, x],
[-1,  1, y],
[ 0, -1, z],
[ 1,  1, t]])
>>> Ab.rref()

(Matrix([
[1, 0, 0],
[0, 1, 0],
[0, 0, 1],
[0, 0, 0]]), (0, 1, 2))

It would be nice if the augmented columns could be passed separately (or the size of the square be indicated) so processing would only occur in the desired columns so

>>> A = Ab[:,:-1]
>>> b = Ab[:,-1]
>> A.rref(rhs=b)  # or Ab.rref(cols=2)
(Matrix([
[1, 0,              x],
[0, 1,         x + y],
[0, 0,    x + y + z],
[0, 0, -2*x - y + t]], <pivots, too>)

The SO issue referenced above has an answer that handles this and uses the rank of the A matrix to determine the size of the identity matrix to stack onto A (with zeros above it) before processing.

sylee957 commented 1 year ago

I think that the semantics can be changed to give two matrix instead of one, to give more easier interpretation, and to match the shape of the input exactly

Matrix([[1, 0], [0, 1], [0, 0], [0, 0]]), Matrix([x, x+y, x+y+z, -2*x-y+t])

Also, the implementation should stop eliminating the columns in RHS

oscarbenjamin commented 1 year ago

Also, the implementation should stop eliminating the columns in RHS

This is needed for all of SymPy's internal use of rref as well as for users.

I think that rref (and echelon_form and anything else similar) should have a parameter that can be used to specify how many columns represent the rhs like Maug.rref(rhscols=1) or something. Another possibility as you say would be like Mr, br = M.rref(b) or _, Minv = M.rref(eye(*M.shape)).

Is there any prior art for this in other systems?

sylee957 commented 1 year ago

Is there any prior art for this in other systems?

I'm not sure about how this is implemented in other systems, however, I'm only sure that libraries like flint have no such options https://fredrikj.net/python-flint/fmpq_mat.html#flint.fmpq_mat.rref so we can have some trouble to implement some options for them (flint may not even support pivots outputs)

I think that the matrix augmentation can be done like:

A = Matrix([[1, 0], [-1, 1], [0, -1], [1, 1]])
b = Matrix([x, y, z, t])
Ab = Matrix.hstack(A, Matrix.eye(A.rows), b)
rref, _ = Ab.rref()
Matrix.hstack(rref[:, :A.cols], rref[:, A.rows + A.cols:])

which can be useful for some polyfilling purporse, or defining the invariants for the tests.

oscarbenjamin commented 1 year ago

(flint may not even support pivots outputs)

No, it doesn't but the pivots output is mostly just a bad idea anyway.

oscarbenjamin commented 1 year ago

In context it would be okay to fall back to a potentially slower method if Flint could not be used.

sylee957 commented 1 year ago

Although the implementation could be easy in theory, because it may just need to free some parameter for iteration (like cols), However, I think that it is quite difficult to pass the parameters correctly because the implementation is quite deep. Here, I just implemented in _row_reduce because DomainMatrix rref is not working automatically for the example, but there may need implementation for 4 DomainMatrix methods as well.

diff --git a/sympy/matrices/matrices.py b/sympy/matrices/matrices.py
index 6dae66671b..d6fdccd710 100644
--- a/sympy/matrices/matrices.py
+++ b/sympy/matrices/matrices.py
@@ -176,10 +176,22 @@ def is_echelon(self):
     def rank(self, iszerofunc=_iszero, simplify=False):
         return _rank(self, iszerofunc=iszerofunc, simplify=simplify)

-    def rref(self, iszerofunc=_iszero, simplify=False, pivots=True,
-            normalize_last=True):
-        return _rref(self, iszerofunc=iszerofunc, simplify=simplify,
-            pivots=pivots, normalize_last=normalize_last)
+    def rref(
+        self,
+        iszerofunc=_iszero,
+        simplify=False,
+        pivots=True,
+        normalize_last=True,
+        max_cols=float('inf')
+    ):
+        return _rref(
+            self,
+            iszerofunc=iszerofunc,
+            simplify=simplify,
+            pivots=pivots,
+            normalize_last=normalize_last,
+            max_cols=max_cols
+        )

     echelon_form.__doc__ = _echelon_form.__doc__
     is_echelon.__doc__   = _is_echelon.__doc__
diff --git a/sympy/matrices/reductions.py b/sympy/matrices/reductions.py
index b2622b2e5c..672fd0325a 100644
--- a/sympy/matrices/reductions.py
+++ b/sympy/matrices/reductions.py
@@ -7,8 +7,18 @@
 from .determinant import _find_reasonable_pivot

-def _row_reduce_list(mat, rows, cols, one, iszerofunc, simpfunc,
-                normalize_last=True, normalize=True, zero_above=True):
+def _row_reduce_list(
+    mat,
+    rows,
+    cols,
+    one,
+    iszerofunc,
+    simpfunc,
+    max_cols: float,
+    normalize_last=True,
+    normalize=True,
+    zero_above=True,
+):
     """Row reduce a flat list representation of a matrix and return a tuple
     (rref_matrix, pivot_cols, swaps) where ``rref_matrix`` is a flat list,
     ``pivot_cols`` are the pivot columns and ``swaps`` are any row swaps that
@@ -63,7 +73,7 @@ def cross_cancel(a, i, b, j):
     swaps = []

     # use a fraction free method to zero above and below each pivot
-    while piv_col < cols and piv_row < rows:
+    while piv_col < max_cols and piv_row < rows:
         pivot_offset, pivot_val, \
         assumed_nonzero, newly_determined = _find_reasonable_pivot(
                 get_col(piv_col)[piv_row:], iszerofunc, simpfunc)
@@ -121,12 +131,27 @@ def cross_cancel(a, i, b, j):

 # This functions is a candidate for caching if it gets implemented for matrices.
-def _row_reduce(M, iszerofunc, simpfunc, normalize_last=True,
-                normalize=True, zero_above=True):
-
-    mat, pivot_cols, swaps = _row_reduce_list(list(M), M.rows, M.cols, M.one,
-            iszerofunc, simpfunc, normalize_last=normalize_last,
-            normalize=normalize, zero_above=zero_above)
+def _row_reduce(
+    M,
+    iszerofunc,
+    simpfunc,
+    normalize_last=True,
+    normalize=True,
+    zero_above=True,
+    max_cols=float('inf')
+):
+    mat, pivot_cols, swaps = _row_reduce_list(
+        list(M),
+        M.rows,
+        M.cols,
+        M.one,
+        iszerofunc,
+        simpfunc,
+        normalize_last=normalize_last,
+        normalize=normalize,
+        zero_above=zero_above,
+        max_cols=max_cols
+    )

     return M._new(M.rows, M.cols, mat), pivot_cols, swaps

@@ -292,8 +317,15 @@ def _rref_dm(dM):
     return M_rref, pivots

-def _rref(M, iszerofunc=_iszero, simplify=False, pivots=True,
-        normalize_last=True):
+def _rref(
+    M,
+    *,
+    max_cols: float,
+    iszerofunc=_iszero,
+    simplify=False,
+    pivots=True,
+    normalize_last=True,
+):
     """Return reduced row-echelon form of matrix and indices of pivot vars.

     Parameters
@@ -377,8 +409,15 @@ def _rref(M, iszerofunc=_iszero, simplify=False, pivots=True,
         else:
             simpfunc = _simplify

-        mat, pivot_cols, _ = _row_reduce(M, iszerofunc, simpfunc,
-                normalize_last, normalize=True, zero_above=True)
+        mat, pivot_cols, _ = _row_reduce(
+            M,
+            iszerofunc,
+            simpfunc,
+            normalize_last,
+            normalize=True,
+            zero_above=True,
+            max_cols=max_cols
+        )

     if pivots:
         return mat, pivot_cols
(END)
smichr commented 1 year ago

This is the approach from the SO reference:

diff --git a/sympy/matrices/matrices.py b/sympy/matrices/matrices.py
index 12bf8524f5..256fd8c4c0 100644
--- a/sympy/matrices/matrices.py
+++ b/sympy/matrices/matrices.py
@@ -177,7 +177,24 @@ def rank(self, iszerofunc=_iszero, simplify=False):
         return _rank(self, iszerofunc=iszerofunc, simplify=simplify)

     def rref(self, iszerofunc=_iszero, simplify=False, pivots=True,
-            normalize_last=True):
+            normalize_last=True, rhs=None):
+        if rhs is not None:
+            A = self
+            B = rhs
+            n, m = A.rows, A.cols
+            p,q = B.rows, B.cols
+            if n != p:
+                raise ValueError('rhs must have same number of rows as self')
+            r = A.rank()
+            if r < n:
+                A11 = A[:r, :]
+                A21 = A[r:, :]
+                O = self.zeros(r, n - r)
+                I = self.eye(n - r)
+                Apad = self.__new__(A, [[A11, O],[A21, I]])
+                R, pv = self.hstack(Apad, B).rref()       
+                return self.hstack(R[:,:m],R[:,-q:]), pv[:r]
+            return self.hstack(A, B).rref()
         return _rref(self, iszerofunc=iszerofunc, simplify=simplify,
             pivots=pivots, normalize_last=normalize_last)

Examples

>>> B = Matrix(3,1,var('b_1:4'))
>>> a=Matrix([
... [1,  1, -1],
... [1, -1,  0],
... [0,  1,  1]])
>>> a.rref(rhs=B)
(Matrix([
[1, 0, 0,  b_1/3 + 2*b_2/3 + b_3/3],
[0, 1, 0,    b_1/3 - b_2/3 + b_3/3],
[0, 0, 1, -b_1/3 + b_2/3 + 2*b_3/3]]), (0, 1, 2))
>>> A
Matrix([
[1, 3, 0, 2],
[0, 0, 1, 4],
[1, 3, 1, 6]])
>>> A.rref(rhs=B)
(Matrix([
[1, 3, 0, 2,              b_1],
[0, 0, 1, 4,              b_2],
[0, 0, 0, 0, -b_1 - b_2 + b_3]]), (0, 2))
smichr commented 1 year ago

Do we need this capability at the lower levels, though? Wouldn't this higher level implementation be sufficient for this purpose?

sylee957 commented 1 year ago

Okay, I think that it is reasonable to do with matrix augmentation trick in higher level. But I just want to get more into details about what should be the best answer for augmenting the matrix. I want to follow some mathematics here.

smichr commented 1 year ago

Note, too, that the size of the addition identity matrix that you need to stack depends on the rank of A.

get more into details about what should be the best answer for augmenting the matrix

Do we need to do this at a lower level? It seems to me that we can let the lower level routines continue doing what they do with the full matrix after adding some extra columns at the point of entry so they do the right thing as far as we are concerned.

sylee957 commented 1 year ago

Note, too, that the size of the addition identity matrix that you need to stack depends on the rank of A.

I'm not sure if the augmented matrix [[A11, O],[A21, I]] is always the full rank. I originally suggested to pad always the identity matrix because that is the only thing I can prove to make the matrix full rank, and the mathematical proof is trivial.

oscarbenjamin commented 1 year ago

Do we need to do this at a lower level?

It is definitely simpler not to do this at the lower-level but there are also potential reasons to want it there because it could be more efficient. It only really comes into play when attempting to solve an inconsistent system of equations but an error could be raised more quickly if it is known that only the first N columns should be used for reduction.

Also if the goal is to determine the symbolic conditions for consistency then it is better for those not to be normalised away like rref does.

sylee957 commented 1 year ago

https://github.com/sympy/sympy/issues/25661#issuecomment-1716873692

I think that your suggestion gives wrong result for some cases In this example, it also has matrix rank 2, but it eliminates the symbols at the right hand side


x, y = symbols('x y')
a, b, c, d = symbols('a b c d')
A = Matrix([[0, 0], [0, 0], [1, 2], [3, 4]])
rhs = Matrix([a, b, c, d])
A.rref(rhs=rhs)
(Matrix([
 [1, 0, 0],
 [0, 1, 0],
 [0, 0, 1],
 [0, 0, 0]]),
 (0, 1))
oscarbenjamin commented 1 year ago

I think that handling this at the higher level by padding with an identity matrix seems fine. It won't add any significant computational complexity. It could be done at the level of the DomainMatrix rref and rref_den methods rather than for each of the lower-level implementations.

Given the OP usecase it make sense to do this for echelon_form as well.

smichr commented 1 year ago

Note, too, that the size of the addition identity matrix that you need to stack depends on the rank of A.

I'm not sure if the augmented matrix [[A11, O],[A21, I]] is always the full rank. I originally suggested to pad always the identity matrix because that is the only thing I can prove to make the matrix full rank, and the mathematical proof is trivial.

Thanks for your patience in bringing me back to what I missed; not working extensively with matrices, these fundamental ideas are not adequately appreciated by me.

smichr commented 1 year ago

I really don't understand the labyrinth of routines to handle rref. First simple thing that doesn't make sense is that the docstring is put in a private method. Why isn't it in the public rref method?

Second, the functionality as implemented in the public rref makes sense to me because after the full matrix gets passed to the private methods they just convert to the appropriate domain and return the result.

BUT I have no idea why or how the errors are being raised during testing:

TypeError: test_sparse_sdm_rref_den() missing 4 required positional arguments: '
name', 'A', 'A_rref', and 'den'

Is there an easy introduction to the concept I am missing here?

donnaaboise commented 1 year ago

(I am the original poster of the SO question and have been following the discussion above)

Here are few thoughts on some of the issues above.

It seems that within the context of rref(), SymPy assumes that symbolic variables are non-zero.

a,b,c = sp.symbols('a,b,c')
A = sp.Matrix(3,3,[a,0,0, 0, b, 0,0,0,c])

print("rank : ", A.rank());
pprint(A.rref())
rank :  3
Matrix([
[1, 0, 0],
[0, 1, 0],
[0, 0, 1]])

Maybe this is the right thing to do? If I include a right hand side, sp.rref() shows its work :

rref_aug(A,B) = 
Matrix([
[1, 0, 0, b_1/a],
[0, 1, 0, b_2/b],
[0, 0, 1, b_3/c]])

Even In the rank deficient case, both A.rref() and the augmented matrix approach gives the expected result, as long as the first r rows are pivot rows.

A = 
Matrix([
[  a, 1,   0],
[  0, 0, b*c],
[a*b, b,   0]])

A.rank() =  2

A.rref()
Matrix([
[1, 1/a, 0],
[0,   0, 1],
[0,   0, 0]])

rref_aug(A,B) = 
Matrix([
[1, 1/a, 0,        b_1/a],
[0,   0, 1,    b_2/(b*c)],
[0,   0, 0, -b*b_1 + b_3]])

If the pivot rows do not appear in the first r rows, the augmented matrix approach will normalize the last column to obtain an additional pivot row.

rref_aug(A,B) = 
Matrix([
[1, 1/a, 0, 0],
[0,   0, 1, 0],
[0,   0, 0, 1]])
oscarbenjamin commented 1 year ago

It seems that within the context of rref(), SymPy assumes that symbolic variables are non-zero.

More precisely SymPy will try to find a result that is valid for generic values of the parameters i.e. for almost all possible (real or complex) values of the symbols the result is supposed to be correct. The set of values for which the result is not valid should be of lower dimension than the set of values for which the result is valid.

Using the lower-level DomainMatrix class you can compute the RREF along with the expression that would be assumed to be nonzero:

In [31]: a, b, c = symbols('a, b, c')

In [32]: M = Matrix([[a-1, 0, 0], [0, b-2, 0], [0, 0, c-3]])

In [33]: M.rref()[0]
Out[33]: 
⎡1  0  0⎤
⎢       ⎥
⎢0  1  0⎥
⎢       ⎥
⎣0  0  1⎦

In [34]: r, d, _ = M.to_DM().rref_den()

In [35]: r.to_Matrix()
Out[35]: 
⎡a⋅b⋅c - 3⋅a⋅b - 2⋅a⋅c + 6⋅a - b⋅c + 3⋅b + 2⋅c - 6                          0                       ↪
⎢                                                                                                   ↪
⎢                        0                          a⋅b⋅c - 3⋅a⋅b - 2⋅a⋅c + 6⋅a - b⋅c + 3⋅b + 2⋅c - ↪
⎢                                                                                                   ↪
⎣                        0                                                  0                       ↪

↪                             0                        ⎤
↪                                                      ⎥
↪  6                          0                        ⎥
↪                                                      ⎥
↪     a⋅b⋅c - 3⋅a⋅b - 2⋅a⋅c + 6⋅a - b⋅c + 3⋅b + 2⋅c - 6⎦

In [36]: d
Out[36]: a*b*c - 3*a*b - 2*a*c + 6*a - b*c + 3*b + 2*c - 6

In [37]: d.as_expr().factor()
Out[37]: (a - 1)⋅(b - 2)⋅(c - 3)

Now provided d is nonzero the RREF is:

In [38]: (r.to_field()/d).to_Matrix()
Out[38]: 
⎡1  0  0⎤
⎢       ⎥
⎢0  1  0⎥
⎢       ⎥
⎣0  0  1⎦

In this case within the 3D (real or complex) (a,b,c) space there are three 2D planes on which d = 0 given by a = 1, b = 2 or c = 3. These make a set of measure zero within the full 3D space in which for almost all possible values of the symbols a,b,c the result returned by M.rref() is correct.

oscarbenjamin commented 1 year ago

Using the lower-level DomainMatrix class you can compute the RREF along with the expression that would be assumed to be nonzero:

Note that this demonstration uses the latest SymPy from git (it won't work without alteration for SymPy 1.12).

smichr commented 1 year ago

@donnaaboise , I was just writing an invitation on SO to review the PR #25687, but I see that you found your way here. If you have any ideas related to the PR itself, please feel free to comment there.