sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.23k stars 429 forks source link

Inexact zeros #35319

Open videlec opened 1 year ago

videlec commented 1 year ago

This issue aims to discuss the problem of comparison, and in particular comparison to zero, of ambient structures that provide exact computations which involve inexact objects or objects with multiple representations. It covers

One of the goal is to come up with conventions that allows to manipulate accurately algebraic constructions on them such as (sparse) polynomials and (sparse) matrices. This is currently completely broken

sage: R.<t> = PowerSeriesRing(QQ)
sage: m = matrix(2, {(0,0): O(t), (1,1): O(t)}, sparse=True)
sage: m * m
[0 0]
[0 0]
sage: R['x,y'](O(t))
0

Where the result here should have been [[O(t^2) 0] [0 O(t^2)]] and O(t). Note that generic dense matrices are better behaved

sage: m = matrix(2, {(0,0): O(t), (1,1): O(t)}, sparse=False)
sage: m * m
[O(t^2)      0]
[     0 O(t^2)]

In order to solve the above algebraic construction, it is desirable to have finer distinction between different notions of being zero in order to distinguish "exact zero" that we can drop from a sparse data structure and "zero up to some precision" that should be kept. More generally, this issue is about about making specifications for comparisons of elements. For that purpose, it might be convenient to distinguish different notions of equalities

Power series currently use approximate equality while balls use semantic equality.

What we would like to use for sparse polynomials and sparse matrices over power series/p-adic are "semantic" equality. Though it is not clear what is the best option for symbolic expressions. Something in between "syntactic" and "semantic" is probably what should be used so that "trivial" simplification could be made to keep reasonable arithmetic speed.

The main questions that are in need for specifications are

saraedum commented 1 year ago

I have only thought about power series and p-adic numbers. So this only applies to these:

I think that bool(x) should be False iff x == 0 is True.

We have x == y when (simplifying a bit) the numbers are equal when reduced to the lower precision of the two. Imho, we cannot realistically change this without breaking all of the p-adics code. (The only alternative notion I can think of would be x == y if they are indistinguishable. I don't believe this is a good choice here.)

if we decide that bool(inexact_zero) is False in sage, then how do we test (efficiently) and (ideally) Python compatibly that an element is zero?

What do you mean? How do we test that an element is an exact zero? (There might not be one in the ring.) How do we test that the element is indistinguishable from parent.zero()? I don't think we have a standard way to test this. It would probably be good to add this to the the basic element class in SageMath.

I don't understand what you mean with "Python compatibly". We cannot use bool(x). You never know what the author meant in inexact rings if you only have a single notion of zeroness (this is of course also a problem with floating point reals.)

what should be bool(matrix) or bool(polynomial) containing exact and inexact zeros only?

A matrix should be falsy if each of its entries is falsy.

A polynomial should be falsy if the length of its coefficients is 0. (Everything else is just too surprising I think.) When coefficients should be dropped is a complicated issue. I don't know what's the latest thinking about this.

videlec commented 1 year ago

Thanks for your comments.

I have only thought about power series and p-adic numbers. So this only applies to these:

(actually, I don't think there is any problem with ball/intervals. I just put it there as a matter of comparison and to be sure to include them in "rings that handle inexact zeros")

I think that bool(x) should be False iff x == 0 is True.

Could you elaborate a bit on this? I am starting thinking that making them behave differently might actually help a bit.

We have x == y when (simplifying a bit) the numbers are equal when reduced to the lower precision of the two. Imho, we cannot realistically change this without breaking all of the p-adics code. (The only alternative notion I can think of would be x == y if they are indistinguishable. I don't believe this is a good choice here.)

As far as I understand, this is the same behavior as power series

sage: R.<t> = PowerSeriesRing(QQ)
sage: t == O(t)
True

And this is furthermore compatible with the ideal reduction

sage: 42 == Zmod(5)(2)
True

The latter being analogous to O(t^n) seen as image in QQ[t] / <t^n>.

if we decide that bool(inexact_zero) is False in sage, then how do we test (efficiently) and (ideally) Python compatibly that an element is zero?

What do you mean? How do we test that an element is an exact zero? (There might not be one in the ring.) How do we test that the element is indistinguishable from parent.zero()? I don't think we have a standard way to test this. It would probably be good to add this to the the basic element class in SageMath.

I don't understand what you mean with "Python compatibly". We cannot use bool(x). You never know what the author meant in inexact rings if you only have a single notion of zeroness ~(this is of course also a problem with floating point reals.)~

By "Python compatibly" I meant something that would apply to Python types int, float where, as far as I know, bool(x) and x == 0 are the only (equivalent) way to test it.

I would precisely like to use bool(x) for testing exact zeroness (whatever it means).

what should be bool(matrix) or bool(polynomial) containing exact and inexact zeros only?

A matrix should be falsy if each of its entries is falsy.

Which is annoying for sparse matrices that store a list of nonzero coefficients. Here you would like bool(matrix) to be equivalent to has a coefficient.

A polynomial should be falsy if the length of its coefficients is 0. (Everything else is just too surprising I think.) When coefficients should be dropped is a complicated issue. I don't know what's the latest thinking about this.

Then for polynomials with power series coefficients QQ[[t]][x], the element p = O(t) x would be such that bool(p) is True (has a coefficient) but p == 0 is True (equality of all coefficients). Which contradicts your initial sentence.

saraedum commented 1 year ago

Could you elaborate a bit on this?

I would not make x == 0 and not bool(x) different. For me bool(x) means "is True or non-zero or non-empty". I think we should be explicit if we want another notion of zeroness and not hide things in some convention like that.

[sparse matrices]

I don't know. I don't do sparse matrices usually so I am not sure what I would expect.

Then for polynomials with power series coefficients QQ[[t]][x], the element p = O(t) x would be such that bool(p) is True (has a coefficient) but p == 0 is True (equality of all coefficients). Which contradicts your initial sentence.

True. The notions of bool(p) meaning non-empty and bool(p) meaning non-zero are not compatible. Depending on whether a polynomial is a container of coefficients or a element of a ring in your use case, you will be unhappy. Probably using our generic polynomials over such rings just won't work no matter how we tweak this bit. (They currently don't work in my experience.)

roed314 commented 1 year ago

I think my intuition is a bit different from Julian's for polynomials (or matrices) f that consist of only inexact zero coefficients. I think they should have f == 0 and not f both be true. In general, I don't think it's likely that generic algorithms designed for exact rings will just work out of the box with inexact rings, whatever convention you use. But the convention that we're advocating for I think will lead to less surprises: leading coefficient will work for example. This convention also restores the condition that Julian initially mentioned (I think that bool(x) should be False iff x == 0 is True).

I think the right approach in the long term is to implement different algorithms for inexact rings that use inequalities on distance and aim for good numerical stability properties.

mezzarobba commented 1 year ago

It seems to me that it is possible to write generic algorithms that behave properly in the presence of inexact zeros (and other things like that) in a computer algebra system, but one needs at least three different notions of equality:

I really don't see how one could get around distinguishing at least the first three. For example, to decide which coefficients of a polynomial should be printed, what matters is whether they are syntactically zero or not. That is not the same as certifying that the polynomial is mathematically zero (the coefficients may be un-normalized symbolic expressions or algebraic numbers, or they may be intervals), yet one may want to do both in the same code.

Each of the four relations listed above is typically weaker than the previous ones, but not always. For instance, for intervals or NaNs, syntactic equality does not imply semantic equality. (I can't think on the top of my head of a case where one may want objects be semantically equal without being approximately equal. Maybe approximate equality is best thought of as coercion to a different algebraic structure followed by comparison in that structure.) Also, generally speaking:

Now some languages (e.g., Julia) provide standard operators that map pretty well to this typology. Unfortunately, this is not the case of Python. Equality in Python most closely corresponds to syntactic equality (cf. its compatibility requirements with hashing), and Sage tries to use it to give the impression of something closer to semantic equality, which leads to all sorts of problems with comparisons between objects that do not have the same parent. I think the broader question is how close we can get to a model which properly distinguishes between the various kinds of equality starting from the current state of things.

It was under the impression that, except maybe in the p-adics code, there was a weak consensus to (ab)use the distinction between bool() and == 0 and use the former for syntactic equality, the latter for semantic equality. But the present discussion suggests that not everyone agrees.

From the same point of view, having t == O(t) for power series only makes the confusion between syntactic and semantic equality worse, by reusing the same notation again for approximate equality. Now the “inner” notion of equality between elements of a given parent can largely be determined on a case-by-case basis, and I don't mind having parents where t == O(t) as long as it plays well with coercion... At the same time, if you want that, you are really working in ℚ[[t]]/〈t〉, not ℚ[[t]], so why not call it like that?

And to conclude these ramblings: I have of course no solution to these issues. I do think we need to write down what == 0 and bool() are supposed to do, and define as many additional equality test methods (with sensible defaults) as needed to support all use cases. If == does not mean provable equality (except maybe in some “optimistic” parents), then we need another “standard” way of testing mathematical equality. In the long term, it is not completely unimaginable to move Sage to a model which carefully distinguishes between 4+ equality test operators or methods, but that would be a painful transition and probably make things confusing for many casual users...

saraedum commented 1 year ago

I don't think it's likely that generic algorithms designed for exact rings will just work out of the box with inexact rings, whatever convention you use.

This is true for real "algorithms". But there are lots of almost trivial methods that can often be implemented to work for all rings (if we could just be specific about the notion of equality) and that saves lots of boilerplate in say p-adic specific polynomial classes. I find it frustrating that in generic methods I have to protect lots of things in if base_ring.is_exact() where to be honest I am just saying if base_ring is not p-adic because I have not clue what the conventions for other inexact rings might be.

In the long term, it is not completely unimaginable to move Sage to a model which carefully distinguishes between 4+ equality test operators or methods

We'd still keep == in its current form, right? But there would be more specific operators that can be used by generic code. We just keep replacing instances of == whenever they cause trouble in generic code…I don't think it makes things confusing for the casual user. If they're working exact, == is fine. And in an inexact context, they will likely be aware of the problem of equality?

videlec commented 1 year ago

I don't think it's likely that generic algorithms designed for exact rings will just work out of the box with inexact rings, whatever convention you use.

This is true for real "algorithms". But there are lots of almost trivial methods that can often be implemented to work for all rings (if we could just be specific about the notion of equality) and that saves lots of boilerplate in say p-adic specific polynomial classes. I find it frustrating that in generic methods I have to protect lots of things in if base_ring.is_exact() where to be honest I am just saying if base_ring is not p-adic because I have not clue what the conventions for other inexact rings might be.

As an example of a "non true" algorithms, I would like to mention generic multivariate polynomial arithmetic (or sparse implementation of univariate) where you store pairs (exponent, coefficient). This is what #35174 is about. Here you only want to store non-zero coefficients in the sense not "syntactic zero". Just for that purpose, it would be extremely convenient if bool(x) would actually mean "not a syntactic zero". Where I mean syntactic zero for something "that prints as 0". The following is not a syntactic zero

sage: sqrt(2) * sqrt(3) - sqrt(6)
sqrt(3)*sqrt(2) - sqrt(6)

(I hope I understood Marc terminology correctly)

In my proposal, both O(t) and sqrt(3)*sqrt(2) - sqrt(6) would evaluate as True under bool.

roed314 commented 1 year ago

I really don't like having O(t) evaluate as True. While I agree that the built-in Python types don't include any inexact numerical objects, the Python documentation states that "zero of any numeric type" should be false in a boolean context. For things that can compare with zero, this makes the rule bool(x) being the same as x != 0 seem good to me. Moreover, this is what's currently implemented in Sage and changing it is likely to break existing code in subtle ways.

I'd be fine creating another method at the level of elements that could be used instead. For p-adics there's _is_exact_zero:

sage: a = Zp(5)(0,4)
sage: a
O(5^4)
sage: a._is_exact_zero()
False
sage: b = Zp(5)(0)
sage: b._is_exact_zero()
True

Maybe we want to think about a non-underscore name that we can use more broadly (or we could just use an underscore method and promote _is_exact_zero to Element).

videlec commented 1 year ago

Referring to the same Python documentation the default behavior of __bool__ is to test for zero __len__. Hence O(t) x would better evaluate to True. Which let me think again that bool(x) should be False iff x is a syntactic zero. The very same logic make sense with sqrt(6) - sqrt(2) * sqrt(3) which is a non-empty expression tree (which happen to evaluate to zero mathematically).

A global function is_syntactic_zero could do the job. But not a method of Element as it will not handle Python types. It will also be less efficient than the builtin bool.

videlec commented 1 year ago

Thanks to your input I updated the description. Feel free to edit or suggest modifications.

mkoeppe commented 1 year ago

The Pythonic way to test for inexact zeroness is to use math.isclose (see also https://numpy.org/doc/stable/reference/generated/numpy.isclose.html#numpy-isclose). The tolerances are always application-specific. Strong -1 on any attempts to find a magic solution for inexact zeroness.

mkoeppe commented 1 year ago

Note that Expressions already have a method is_trivial_zero with useful semantics: It is simply a fast sufficient test for zeroness.

Attempting to define "syntactic equality" that goes beyond that is likely problematic.

videlec commented 1 year ago

The Pythonic way to test for inexact zeroness is to use math.isclose (see also https://numpy.org/doc/stable/reference/generated/numpy.isclose.html#numpy-isclose). The tolerances are always application-specific. Strong -1 on any attempts to find a magic solution for inexact zeroness.

You probably misunderstood the issue at hand. Floating points are behaving perfectly well : they have a unique zero (though NaN is a bit problematic). One of the main question at hand is that O(t) and 0 (in the ring of power series) are treated the same. And this is dramatic as they do not mean the same thing. Another way to put it is that we are discussing the algebraic side of things (= numbers that carry their error) rather than the numerical side of things (= floating point and hope for the best).

videlec commented 1 year ago

Note that Expressions already have a method is_trivial_zero with useful semantics: It is simply a fast sufficient test for zeroness.

Would you call that a specification ? It is certainly useful for end users and certainly useless when considering polynomials over the symbolic ring that behave in a deterministic way. I rather care about the second point.

Attempting to define "syntactic equality" that goes beyond that is likely problematic.

What do you mean by "that" ? If you refer to the symbolic ring I do not think that any of us seriously care about the semantic of the zeroness in the sage SymbolicRing.

Could you point us to the problematic points rather than saying that it will be problematic ? What is wrong with the different proposals that have emerged from the discussion ?

videlec commented 1 year ago

@mkoeppe I tried to clarified the issue. Tell me if it makes more sense.

I totally agree that symbolic expressions are more delicate than power series/p-adics which are my main concern at the moment. But still, it makes sense to try to have a general framework for all of them.

mkoeppe commented 1 year ago

Thanks for the clarifications.

Taking a closer look at the example in the issue description with the PowerSeriesRing, the issue is that O(t) == 0 evaluates to true. Why would that ever be a good idea?

In particular, it's intransitive

sage: R.<t> = PowerSeriesRing(QQ)
sage: 0 == O(t)
True
sage: O(t) == t
True
sage: 0 == t
False

... and therefore whatever may be intended there should not be expressed using == but using some method call.

mkoeppe commented 1 year ago

I think that bool(x) should be False iff x == 0 is True.

I agree with this; I don't think this can be negotiable.

mkoeppe commented 1 year ago

Could you point us to the problematic points rather than saying that it will be problematic ? What is wrong with the different proposals that have emerged from the discussion ?

The fundamental problem is that syntactic features such as sparsity of matrices and distinguishing "different representations" of an object contradicts the declared mathematical category and leads too easily to inconsistent behavior. An example question – what is the neutral element in the "additive group" of sparse matrices? (Answer: it's a trick question – sparse matrices don't form an additive group.)

If the syntactically refined structure is really useful, then it should be analyzed and defined properly, separately from the original structure, and the original structure should become a quotient. (@mezzarobba expressed a similar point in an example in https://github.com/sagemath/sage/issues/35319#issuecomment-1477930267)

videlec commented 1 year ago

It would indeed make sense in the power series ring to set 0 == O(t) being False. The balls go even further in the sense that x == y if and only if x and y have radius 0 and same center.

What about adopting the same convention for power series?

mezzarobba commented 1 year ago

(@mezzarobba expressed a similar point in an example in #35319 (comment))

Yes, in a sense, but to be clear, I do support the idea of adding better-specified comparison operators to address the issue discussed here.

roed314 commented 1 year ago

I completely disagree that we should require radii to be the same for equality in power series rings. It's absolutely a good idea for O(t) == 0 to be true there. Here are some reasons.

The real field has lots of different precision models, as do p-adics. There's only one version of power series at the moment. I'm completely supportive of implementing additional precision models for power series rings, and if you want to implement a new one that uses equality of balls, go for it. But don't break everyone's existing code with this change.

roed314 commented 1 year ago

For the original issue about how to test for whether to keep entries in a sparse representation (or trim leading zeros), I think a new function or method that handles the different cases is a reasonable solution. If you're worried about speed you could also have a method on the parent like _has_inexact_zeros so that in the common (exact) case you can do one test outside the loop and then use __bool__ in the loop.

videlec commented 1 year ago

Just for the record : PARI does comparison of power series in the sage style but MAGMA does not. From their documentation

Since a given series ring will contain series truncated at arbitrary precision, care has to be taken as to the meaning of equality of two elements. Two series are considered equal iff they are identical (the equality operator eq follows this convention). But we call two series A and B in the same ring weakly equal if and only if their coefficients are the same whenever both are defined, that is, if and only if for every n≤p the coefficients An and Bn are equal, where p is the minimum of the precisions of A and B. Thus, for example, A=3 + x + O(x2) and B=3 + x + 17x2 + O(x4) are the same, but C=3 + x + 17x2 + x3 + O(x4) is different from B. Note that A and C are equal under this definition, and hence weak equality is not transitive!

Their documentation is still unclear about the meaning of "identical".

videlec commented 1 year ago

Let me try to recap.

  1. bool(x) should remain equivalent to x == 0
  2. We need more ways to compare elements x and y
  3. We could either implement these various comparisons with new functions/operators or by adding extra parameters in the parents; possibly both.

So the remaining questions are

mezzarobba commented 1 year ago

@saraedum:

I have only thought about power series and p-adic numbers. So this only applies to these:

I think that bool(x) should be False iff x == 0 is True.

@mkoeppe:

I agree with this; I don't think this can be negotiable.

@videlec:

bool(x) should remain equivalent to x == 0

Are you all talking only about power series, or in general? What do you think of the cases where bool(x) and x == 0 are not equivalent at the moment? For instance:

sage: a = RIF(-1,1)
sage: bool(a)
True
sage: a == 0
False
sage: a != 0
False

(and similarly for CIF, RBF, CBF), or:

sage: b = SR(0)
sage: bool(b)
False
sage: b == b
0 == 0
sage: bool(b == b)
True
mezzarobba commented 1 year ago

should we implement them "operator" style, "parameters in parent" style? or both? or something else?

I would say both. That is, ideally, I think we need “operator style” comparisons for all the variants of equality listed in https://github.com/sagemath/sage/issues/35319#issuecomment-1477930267, but each parent can have some freedom to define what equality means for its elements. How much freedom it can have depends mainly