sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.41k stars 475 forks source link

Enhance matrix similarity check #18505

Closed 1d7ec08f-60ae-4512-91a6-8324c06eab9f closed 7 years ago

1d7ec08f-60ae-4512-91a6-8324c06eab9f commented 9 years ago

The addition of rational form makes it possible to always compute a canonical equivalence class representative for matrix similarity, using only field operations. Computing the similarity transformation is still limited by the necessity of using Jordan form and the resulting problem of eigenvalues lying outside of the field used for the matrix entries. Various improvements to the code and error-checking have been made.

CC: @ThomasGagne @tscrim @videlec @rbeezer

Component: linear algebra

Keywords: matrix is_similar similarity

Author: Rob Beezer

Branch/Commit: 92aac67

Reviewer: Vincent Delecroix, Travis Scrimshaw

Issue created by migration from https://trac.sagemath.org/ticket/18505

1d7ec08f-60ae-4512-91a6-8324c06eab9f commented 9 years ago

New commits:

465e22818505: enhance matrix similarity check
1d7ec08f-60ae-4512-91a6-8324c06eab9f commented 9 years ago

Commit: 465e228

1d7ec08f-60ae-4512-91a6-8324c06eab9f commented 9 years ago

Author: Rob Beezer

1d7ec08f-60ae-4512-91a6-8324c06eab9f commented 9 years ago

Branch: u/rbeezer/is-similar

videlec commented 9 years ago

Reviewer: Vincent Delecroix

videlec commented 9 years ago
comment:3

Hello,

This looks like a very nice improvement! I am slowly digging into the algorithmic part of it. But let me first start with simple remarks.

  1. First of all, the standard for doctest continuation is now ....: instead of .... Moreover, when transformation=True and when the matrices are similar the method is_similar used to return (False, matrix) instead of (True, matrix). You are carefully ignoring the first argument in all the tests! So either it is useless and I suggest to just return None or matrix if transformation=True or test the first argument appropriately. Finally, some tiny improvements can be obtained using m.get_unsafe(i,j) instead of m[i,j]. I provided 3 commits for these that can be found at public/18505

  2. Instead as invoking QQbar if the ground field is QQ you can use the method .algebraic_closure() that also works for prime fields

sage: QQ.algebraic_closure()
Algebraic Field
sage: GF(3).algebraic_closure()
Algebraic closure of Finite Field of size 3

Though, I do not know whether your code would work in this case.

  1. I do not like the fact that for input in ZZ, the transformation matrix has coefficient in QQbar. I do not see why the base field should be extended. Conjugacy makes sense within M_n(ZZ).
sage: m = matrix(ZZ,2,[0,1,1,1])
sage: a = matrix(ZZ,2,[1,1,0,1])
sage: b = matrix(ZZ,2,[1,0,1,1])
sage: c = a*b*b*a*~b
sage: m2 = c * m * ~c
sage: m.is_similar(m2)
True
sage: m.is_similar(m2, True)
(
      [   1.000000000000000?               0.?e-18]
True, [  0.7894736842105263? -0.05263157894736842?]
)

(sided note: the above command used to return (False, matrix)... see item 0)

  1. In the case matrices are not square or have different dimension I would rather raise an error instead of returning False. Conjugacy does not make any sense for them. Moreover, everything can be tested at once using if parent(self) is not parent(other):.

  2. I learned that a lot of time is actually taken by the import statements. You can try to minimize the number of them.

  3. The usage of quo_rem looks weird. I thought that q,r = a.quo_rem(b) would be equivalent to q = a//b; r = a%b but it is not! For example (1/3) % 5 returns 2 as it is the inverse of 3 mod 5. What do you think about the semantics of // and % against the method quo_rem?

More to come...

Vincent

fchapoton commented 8 years ago
comment:4

I have some of the suggested changes.


New commits:

0c6c6a1Trac #18505: (review) new style doctest continuation
e80b474Trac #18505: (review) fix return value
2386df5Trac #18505: (review) optimization
1e7b3c4Merge branch 'public/18505' in 7.4.rc1
bb15ba6trac 18505 towards a better is_similar
25e0913trac 18505 details
fchapoton commented 8 years ago

Changed commit from 465e228 to 25e0913

fchapoton commented 8 years ago

Changed branch from u/rbeezer/is-similar to public/18505

tscrim commented 8 years ago
comment:5

Replying to @videlec:

  1. In the case matrices are not square or have different dimension I would rather raise an error instead of returning False. Conjugacy does not make any sense for them. Moreover, everything can be tested at once using if parent(self) is not parent(other):.

Strong -1 from me on this; it puts an extra burden on the user. Suppose you are checking a set of matrices where your construction creates matrices of multiple sizes and seeing which of them are similar. Then you either have to manually add checks (duplication) or wrap things with a try-except block (obfuscation). It still makes sense with the definition of similar that is_similar would return false when the (square) matrices are different sizes. The argument weakens for non-square matrices, but I still think it puts a completely unnecessary burden on the user.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 25e0913 to 08ddbd1

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

08ddbd1trac 18505 back to False for mismatched sizes
fchapoton commented 8 years ago
comment:7

As a compromise, I propose to raise ValueError for non-square matrices and return False for mismatched square matrices.

tscrim commented 8 years ago
comment:8

I don't particularly like it, but I can live with that (which means it must be a good compromise :P).

fchapoton commented 8 years ago
comment:9

setting to major, because I am going to need this.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 08ddbd1 to e12f614

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

e12f614trac 18505 adding a new kind of example
tscrim commented 8 years ago
comment:11

While I would like a better solution to Vincent's point 3, but I suspect solving that would require some effort. So it's not something that would block me from setting this to a positive review. Vincent, Rob, any other comments or issues?

videlec commented 8 years ago
comment:12

Replying to @tscrim:

Replying to @videlec:

  1. In the case matrices are not square or have different dimension I would rather raise an error instead of returning False. Conjugacy does not make any sense for them. Moreover, everything can be tested at once using if parent(self) is not parent(other):.

Strong -1 from me on this; it puts an extra burden on the user. Suppose you are checking a set of matrices where your construction creates matrices of multiple sizes and seeing which of them are similar. Then you either have to manually add checks (duplication) or wrap things with a try-except block (obfuscation). It still makes sense with the definition of similar that is_similar would return false when the (square) matrices are different sizes. The argument weakens for non-square matrices, but I still think it puts a completely unnecessary burden on the user.

It is not only a burden. It is a problem of well defined specifications.

With your proposition, a False answer would have two meanings. If a question does not make sense it should really be an error. It is like asking 1 / 0 == 3 (that indeed raise an error in Python). As a consequence of such behavior things such as m.is_similar(1) would be ambiguous (is there an implicit coercion in this command? In other words, if the answer is False does it mean that my matrix is not similar to the identity?).

With your pseudo use-case, it is very easy to sort first by parents and then check for similarity.

sage: elts = [1, 2/3, 5, 5/2, 3.4, 0.0]
sage: from collections import defaultdict
sage: d = defaultdict(list)
sage: for e in elts: d[e.parent()].append(e)
sage: d[RR]
[3.40000000000000, 0.000000000000000]
sage: d[ZZ]
[1, 5]
sage: d[QQ]
[2/3, 5/2]
fchapoton commented 8 years ago
comment:13

I am ready to go back to raising error for square matrices of different sizes.

But first, do yo have any other comment ?

The thing we really need (in another ticket) is a canonical form algo that gives the change-of-basis matrix. It seems that m*gma has something like that, with a reference.

tscrim commented 8 years ago
comment:14

Replying to @videlec:

Replying to @tscrim:

Replying to @videlec:

  1. In the case matrices are not square or have different dimension I would rather raise an error instead of returning False. Conjugacy does not make any sense for them. Moreover, everything can be tested at once using if parent(self) is not parent(other):.

Strong -1 from me on this; it puts an extra burden on the user. Suppose you are checking a set of matrices where your construction creates matrices of multiple sizes and seeing which of them are similar. Then you either have to manually add checks (duplication) or wrap things with a try-except block (obfuscation). It still makes sense with the definition of similar that is_similar would return false when the (square) matrices are different sizes. The argument weakens for non-square matrices, but I still think it puts a completely unnecessary burden on the user.

It is not only a burden. It is a problem of well defined specifications.

With your proposition, a False answer would have two meanings. If a question does not make sense it should really be an error. It is like asking 1 / 0 == 3 (that indeed raise an error in Python). As a consequence of such behavior things such as m.is_similar(1) would be ambiguous (is there an implicit coercion in this command? In other words, if the answer is False does it mean that my matrix is not similar to the identity?).

The specifications are well-defined. As I have learned it: A is similar to B if there exist an invertible matrix P such that A = PBP-1. This can be interpreted as A and B are similar iff they are the same linear transformation. So are you saying it should raise an error when checking if two maps are equal if have different (co)domains?

I am still leaning towards non-square matrices returning False and considering the error being raised a regression.

You're arguments are unconvincing to me because checking similarity of different sized matrices is not an invalid mathematical comparison (much less an operation). For m.is_similar(1), this is comparing a matrix to something which is not a matrix, and an error would be appropriate in this case.

With your pseudo use-case, it is very easy to sort first by parents and then check for similarity.

sage: elts = [1, 2/3, 5, 5/2, 3.4, 0.0]
sage: from collections import defaultdict
sage: d = defaultdict(list)
sage: for e in elts: d[e.parent()].append(e)
sage: d[RR]
[3.40000000000000, 0.000000000000000]
sage: d[ZZ]
[1, 5]
sage: d[QQ]
[2/3, 5/2]

You've done an import statement and some more advanced Python-foo. There is no way I could teach that to my linear algebra students or to an average Sage user.

Actually, does anyone know what the other Ma's do?

videlec commented 8 years ago
comment:15

Replying to @tscrim:

Replying to @videlec:

Replying to @tscrim:

Replying to @videlec:

  1. In the case matrices are not square or have different dimension I would rather raise an error instead of returning False. Conjugacy does not make any sense for them. Moreover, everything can be tested at once using if parent(self) is not parent(other):.

Strong -1 from me on this; it puts an extra burden on the user. Suppose you are checking a set of matrices where your construction creates matrices of multiple sizes and seeing which of them are similar. Then you either have to manually add checks (duplication) or wrap things with a try-except block (obfuscation). It still makes sense with the definition of similar that is_similar would return false when the (square) matrices are different sizes. The argument weakens for non-square matrices, but I still think it puts a completely unnecessary burden on the user.

It is not only a burden. It is a problem of well defined specifications.

With your proposition, a False answer would have two meanings. If a question does not make sense it should really be an error. It is like asking 1 / 0 == 3 (that indeed raise an error in Python). As a consequence of such behavior things such as m.is_similar(1) would be ambiguous (is there an implicit coercion in this command? In other words, if the answer is False does it mean that my matrix is not similar to the identity?).

The specifications are well-defined. As I have learned it: A is similar to B if there exist an invertible matrix P such that A = PBP-1. This can be interpreted as A and B are similar iff they are the same linear transformation. So are you saying it should raise an error when checking if two maps are equal if have different (co)domains?

For me it is rather an algebraic statements. And the question Is A similar to B in X makes sense in any given algebra X and two of its elements A and B. And the algebra X is definitely part of the question (see point 3 in comment:3). If there is no "natural" algebra containing A and B the question is not well defined. In Sage, elements have parents and it should be expected that we are answering the question "is similar" (or "is conjugate") in their common parent (involving coercion if needed).

As you suggested "similar" is mostly applied for linear algebra and the appropriate adjective in general is "conjugate". But for me, is_similar should just be an alias of is_conjugate in the case of matrices.

What you are suggesting is a is_similar with the following behavior

def is_similar(self, other):
    try:
        return self.is_conjugate(other)
    except Exception:
        return False

where the is_conjugate would be defined as I said. Do you think we should have both is_conjugate and is_similar?

I am still leaning towards non-square matrices returning False and considering the error being raised a regression.

You're arguments are unconvincing to me because checking similarity of different sized matrices is not an invalid mathematical comparison (much less an operation). For m.is_similar(1), this is comparing a matrix to something which is not a matrix, and an error would be appropriate in this case.

You are just missing the point. The identity matrix and the integer 1 are considered equal in Sage

sage: identity_matrix(10) == 1
True

Actually, does anyone know what the other Ma's do?

At least I found

tscrim commented 8 years ago
comment:16

Replying to @videlec:

Replying to @tscrim:

The specifications are well-defined. As I have learned it: A is similar to B if there exist an invertible matrix P such that A = PBP-1. This can be interpreted as A and B are similar iff they are the same linear transformation. So are you saying it should raise an error when checking if two maps are equal if have different (co)domains?

For me it is rather an algebraic statements. And the question Is A similar to B in X makes sense in any given algebra X and two of its elements A and B. And the algebra X is definitely part of the question (see point 3 in comment:3). If there is no "natural" algebra containing A and B the question is not well defined. In Sage, elements have parents and it should be expected that we are answering the question "is similar" (or "is conjugate") in their common parent (involving coercion if needed).

The natural algebra in this case would be the product algebra over a common (larger) base ring.

As you suggested "similar" is mostly applied for linear algebra and the appropriate adjective in general is "conjugate". But for me, is_similar should just be an alias of is_conjugate in the case of matrices.

What you are suggesting is a is_similar with the following behavior

def is_similar(self, other):
    try:
        return self.is_conjugate(other)
    except Exception:
        return False

where the is_conjugate would be defined as I said. Do you think we should have both is_conjugate and is_similar?

I think they should be aliases because x.is_conjugate(y) should be the same as y in x.conjugacy_class(), and the latter of which would not result in an error. However, I am probably able live with a stricter is_conjugate, although this has a slight name conflict with complex conjugate, which we shorten to conjugate.

I am still leaning towards non-square matrices returning False and considering the error being raised a regression.

You're arguments are unconvincing to me because checking similarity of different sized matrices is not an invalid mathematical comparison (much less an operation). For m.is_similar(1), this is comparing a matrix to something which is not a matrix, and an error would be appropriate in this case.

You are just missing the point. The identity matrix and the integer 1 are considered equal in Sage

sage: identity_matrix(10) == 1
True

I guess since we are allowing coercion of the arguments, then m.is_similar(1) should coerce to a common parent, which is m.parent() and then check. However, it should not result in an error if no common parent is found, it should simply be false. What if m and n are the same size, but had no common base ring, should that raise an error?

More I think about it, the more I am convinced that essentially any of the is_* methods should not error out when trivially incorrect. They are checking properties, not performing operations.

Actually, does anyone know what the other Ma's do?

At least I found

  • maple... but they do not specify what happens for different matrix sizes.

Maple returns false. I couldn't find anything on Mathematica either.

fchapoton commented 8 years ago
comment:17

I would really like to get this ticket in, guys. Please help. Would you mind giving a positive review ?

There will be a conflict with an existing ticket tackling "... --> .....:" in the matrix folder.. And there may also be a conflict with some xrange-cleaning ticket. :(

fchapoton commented 8 years ago
comment:18

The reference for the algorithms used by m*gma is

Allan Steel.
A New Algorithm for the Computation of Canonical Forms of Matrices over Fields.
J. Symbolic Comp., 24(3):409--432, 1997. 
tscrim commented 8 years ago
comment:19

I strongly believe we should have is_similar not raise an error when comparing mismatched sizes or non-square matrices.

fchapoton commented 8 years ago
comment:20

I must say that I am rather on Vincent's side here (raising appropriate errors for nonsensical input)

But I really think that whatever is done, it will be much better than keeping the current state of affairs. Please guys, let us move forward!

tscrim commented 8 years ago
comment:21

I think we should bring it to a discussion/vote on sage-devel.

fchapoton commented 8 years ago
comment:22

I have started a vote on sage-devel.

fchapoton commented 7 years ago
comment:23

If I count correctly, the result of the vote is

AA = 6
BB = 3
?A = 1

so I am going to go back to raising errors in both cases.

I somebody wants to remove the "is_similar" method for non-square matrices, it should be in another ticket.

fchapoton commented 7 years ago

Changed branch from public/18505 to u/chapoton/18505

fchapoton commented 7 years ago

Changed commit from e12f614 to 4740a42

fchapoton commented 7 years ago

New commits:

914fb4fMerge branch 'public/18505' in 7.5.b1
4740a42trac 18505 back to raising errors
fchapoton commented 7 years ago
comment:25

bot is green, please review

1d7ec08f-60ae-4512-91a6-8324c06eab9f commented 7 years ago
comment:26

I've looked over all the accumulated changes, and really like what I see. Thanks for all the clean-up and improvements. I'm sorry my Sage developent skills are so rusty that I can't be more help.

1d7ec08f-60ae-4512-91a6-8324c06eab9f commented 7 years ago
comment:27

I was going to put this on the sage-devel vote, but it looks like the topic got locked.

Thanks to Vincent, Travis and Frédéric for their careful work on this (and I'm glad to see Travis defending use by students who don't know much math or much Python). I'm sorry to be so late in picking up on this one and missing the vote.

I have fond memories of writing .zigzag_form() as a precursor of rational canonical form (is that the best method name ever?). So I'm really glad to see the similarity check becoming fully capable, all the discussion about user interaction, and a few speed-ups for zig-zag form.

fchapoton commented 7 years ago
comment:28

Vincent, would you give a positive review ?

fchapoton commented 7 years ago
comment:29

please, somebody ? this is ready and useful !

videlec commented 7 years ago
comment:30

Why did this change

+        A matrix over the integers will be promoted to a
+        matrix over the rationals, but may need to be manually
+        promoted to the algebraic numbers.  ::
+
             sage: A = matrix(ZZ, 2, 2, range(4))
             sage: B = matrix(QQbar, 2, 2, range(4))
             sage: A.is_similar(B)
+            Traceback (most recent call last):
+            ...
+            TypeError: matrices need to have entries with identical fraction fields,
+            not Rational Field and Algebraic Field
+            sage: A.change_ring(QQbar).is_similar(B)
             True

There is a natural coercion here.

videlec commented 7 years ago
comment:31
sage: cm = get_coercion_model()
sage: m1 = matrix(ZZ, 2, range(4))
sage: m2 = matrix(QQbar, 2, range(4))
sage: mm1, mm2 = cm.canonical_coercion(m1, m2)
sage: mm1.parent()
Full MatrixSpace of 2 by 2 dense matrices over Algebraic Field
sage: mm2.parent()
Full MatrixSpace of 2 by 2 dense matrices over Algebraic Field
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 4740a42 to bfedf0e

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

de86aefMerge branch 'u/chapoton/18505' in 7.5.b3
bfedf0etrac 18505 trying to introduce coercion in the game
fchapoton commented 7 years ago
comment:33

Here is a commit trying to involve coercion if possible. Not tested yet.

videlec commented 7 years ago
comment:34

Looks good. But you can avoid complications with

if parent(A) != parent(B):
    coercion

at the very begining. But we would loose specific error message for different sizes.

tscrim commented 7 years ago
comment:35

Since MatrixSpace is a subclass of UniqueRepresentation, you can do parent(A) is parent(B). Although, because this is Cython, I think you can use have_same_parent from element.pxd.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

86a80c2trac 18505 using have_same_parent
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from bfedf0e to 86a80c2

fchapoton commented 7 years ago
comment:37

I am not fond of the idea of starting by comparing parents. Better keep the useful error messages.

Here is a new version using have_same_parent, but still not tested, and 2 doctests are not filled with the results

tscrim commented 7 years ago
comment:38

There is also the @coerce_binop decorator that hides that logic. Although because this is Cython, I'm not sure this would necessarily work as I forget how it can handle decorators...

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

6b168e4trac 18505 fixing details