Closed mezzarobba closed 5 years ago
Do you plan to work on this? If yes, fill in your name as reviewer. If no, let me know.
Description changed:
---
+++
@@ -1 +1 @@
-As discussed in the sage-devel thread starting with `<o21nte$6jp$1@blaine.gmane.org>`, `Element.__richcmp__()` should not fall back on comparing by type/id.
+As discussed in the sage-devel thread starting with `<o21nte$6jp$1@blaine.gmane.org>`, `coercion_model.richcmp()` should not fall back on comparing by type/id.
Description changed:
---
+++
@@ -1 +1 @@
-As discussed in the sage-devel thread starting with `<o21nte$6jp$1@blaine.gmane.org>`, `coercion_model.richcmp()` should not fall back on comparing by type/id.
+`coercion_model.richcmp()` should not fall back on comparing by type/id.
Replying to @jdemeyer:
Do you plan to work on this? If yes, fill in your name as reviewer.
as author? owner?
If no, let me know.
I've started doing experiments, but I'll most likely need help from people who understand the parts of Sage that rely on the current behavior. At the very least I'll try to push a first draft in the next few days.
My main concern right now is that sage.graphs
heavily relies on sorting lists of vertices, while various other parts of Sage create graphs with vertices that mix Elements with other Python objects.
An option might be to try again with key=id
when sort()
fails, but (with no global understanding of the code, at least) there's a major risk of getting inconsistent orders with different subsets of vertices.
Another idea would be to do introduce utilities like
# universal_cmp.pyx
cpdef lt(x, y):
try:
return x < y
except TypeError:
if type(x) is type(y):
return id(x) < id(y)
else:
return type(x) < type(y)
...
cdef class key(object):
def __init__(self):
self.value = value
def __lt__(self, other):
return lt(self.value, other.value)
...
and then use universal_cmp.lt()
and sort(key=universal_cmp.key)
, but this won't work either when one of the types involved only has a partial order, since this order will not be consistent with id
.
Yet another variant would be to pass a comparison function (or, equivalently but probably better in view of the transition to python3, a key
as above) that wraps cmp()
and catches TypeError
s. Not really ideal either, but at least it wouldn't introduce any regression(?), so perhaps that would be enough for this ticket...
Any advice?
Replying to @mezzarobba:
Replying to @jdemeyer:
Do you plan to work on this? If yes, fill in your name as reviewer.
as author? owner?
Sorry, I meant as author.
By the way, why did you remove the reference to the sage-devel thread?
Author: Marc Mezzarobba
Replying to @mezzarobba:
By the way, why did you remove the reference to the sage-devel thread?
Because I didn't understand how to interpret it. A hyperlink would be better.
Replying to @mezzarobba:
My main concern right now is that
sage.graphs
heavily relies on sorting lists of vertices
Why does it rely on sorting vertices?
Replying to @jdemeyer:
Replying to @mezzarobba:
By the way, why did you remove the reference to the sage-devel thread?
Because I didn't understand how to interpret it. A hyperlink would be better.
I personally prefer Message-Ids, as they don't depend on a particular archive...
Replying to @jdemeyer:
Replying to @mezzarobba:
My main concern right now is that
sage.graphs
heavily relies on sorting lists of verticesWhy does it rely on sorting vertices?
In sparse_graph
and generic_graph
, mostly. Some of the occurrences look like they are there for algorithmic purposes, others to make the output more readable, some both at the same time... I didn't look closely enough to say more.
Description changed:
---
+++
@@ -1 +1,2 @@
+As discussed in the sage-devel thread starting with `<o21nte$6jp$1@blaine.gmane.org>` (https://groups.google.com/d/msg/sage-devel/YVFdxPH6avk/4OZUmzLHBgAJ),
`coercion_model.richcmp()` should not fall back on comparing by type/id.
Replying to @mezzarobba:
In
sparse_graph
andgeneric_graph
, mostly. Some of the occurrences look like they are there for algorithmic purposes, others to make the output more readable, some both at the same time... I didn't look closely enough to say more.
It's also important to note the difference between old-style cmp()
and new-style rich comparisons. Ideally, this branch would only affect the latter.
Replying to @jdemeyer:
It's also important to note the difference between old-style
cmp()
and new-style rich comparisons. Ideally, this branch would only affect the latter.
I'm not sure I follow you: cmp()
can end up calling Element.__richcmp__()
, and that's perhaps the main reason why getting rid of the comparison by id there is not trivial.
Replying to @mezzarobba:
My main concern right now is that
sage.graphs
heavily relies on sorting lists of vertices, while various other parts of Sage create graphs with vertices that mix Elements with other Python objects.An option might be to try again with
key=id
whensort()
fails, but (with no global understanding of the code, at least) there's a major risk of getting inconsistent orders with different subsets of vertices.
If you have no information on the objects you're sorting, I don't think there is anything that can save you. If your objects are guaranteed to be hashable you could try sorting on the hash, but that will leave your results undefined when hashes clash. In any case, if you rely on "first try what the object implements and then use a fallback if that fails", you're doomed.
In fact, if you allow your vertices to be labelled with arbitrary objects, you're already losing transitivity of equality in sage, via the standard
sage: a=GF(3)(1); b =31; c=GF(5)(1)
sage: a==b,b==c,a==c
(True, True, False)
It seems that labels get mangled anyway, though, so I think them not being sorted properly is only a minor concern by comparison (sorry):
sage: a=GF(3)(1); b =31; c=GF(5)(1)
sage: L=[[a,b,c],[a,c,b],[b,a,c],[b,c,a],[c,a,b],[c,b,a]]
sage: for l in L:
....: G=Graph()
....: for i in l:
....: G.add_vertex(i)
....: print [parent(j) for j in G.vertices()]
....:
[<type 'int'>, Integer Ring]
[<type 'int'>, Integer Ring]
[<type 'int'>, Integer Ring]
[<type 'int'>, Integer Ring]
[<type 'int'>, Integer Ring]
[<type 'int'>, Integer Ring]
It seems to me the only sane approaches are to not rely on sorting (in python it seems being hashable is more ubiquitous than being sortable, so fast lookup is likely better accomplished via hash functions), or to declare that the labels are sortable (making failure a user error).
This stuff causes real problems: since the old python interface for __cmp__
methods required one to provide non-errors for "<" and ">" testing if one wants to implement "==", sage is riddled with apparent implementations of orderings that are in actuality inconsistent. This has cause problems in at least one case, where labels in published tables of elliptic curves were based on non-sensical but assumed-consistent ordering of number field elements:
https://groups.google.com/forum/#!topic/sage-nt/aP5LP09sADY
This has convinced me that the only sane approach is to deprecate "cmp" style interfaces as quickly as possible and instead rely on "richcmp", with errors for ordering if not obviously correct.
Here is a first attempt. All (standard, long) tests should pass unless I made a last-minute mistake. See the commit messages for some comments. I'm not 100% happy with the changes, but I don't think I can do much better in a reasonable amount of time, feel free to commit improvements to the existing branch!
Last 10 new commits:
d45cd53 | CGraphBackend: fix a segfault when given a non-existing vertex |
6e0e1b2 | Fix use of comparison in simplicial_complex |
2925d0f | Bug in Poset.is_slender(?) |
2b0bcc9 | Some benign doctest failures related to #22029 |
5f36421 | Ad-hoc comparison functions for graph vertices |
6d728fa | {c,sparse}_graph: systematically turn integer-like vertices into ints |
fcda2c2 | graph_plot: sort by id |
127a11d | remove or fix some meaningless doctests for cmp() |
29fe1f8 | misc fixes after changes to coercion_model.richcmp |
749dfbe | misc fixes after changes to coercion_model.richcmp |
Commit: 749dfbe
I am a strong -1 on the removal of the !=
tests and the changes of foo != bar
to (the fugly) not (foo == bar)
.
The failures you have in rigged_configurations.py
is because you moved the definition of case_S
. You have also made the cases in the code more difficult to follow. Please revert those changes.
This is bad:
# (Note that, when y is not a Sage Element, we may have ended up here
# from a call to cmp(). In this case, and in this case only, it may be
# better to emulate what Python does and compare by type/id. But we
# have no way(?) to distinguish this situation from the "normal" case
# of a comparison operator.)
Replying to @tscrim:
I am a strong -1 on the removal of the
!=
tests and the changes offoo != bar
to (the fugly)not (foo == bar)
.
Ugly, I agree, but safer in generic code (because of cases where equality is not decidable, like expressions, or where not enough information is available, like with intervals)... And code that might end up comparing objects of different type is generic code.
The failures you have in
rigged_configurations.py
is because you moved the definition ofcase_S
. You have also made the cases in the code more difficult to follow. Please revert those changes.
I might do it, but I'd like first to understand in detail the alternative you are suggesting, and to see some evidence of a consensus that it is a better option.
In any case, please feel free to commit improvements yourself, and I'll do my best to review them.
Replying to @jdemeyer:
This is bad:
# (Note that, when y is not a Sage Element, we may have ended up here # from a call to cmp(). In this case, and in this case only, it may be # better to emulate what Python does and compare by type/id. But we # have no way(?) to distinguish this situation from the "normal" case # of a comparison operator.)
What do you mean? If I understand right, this is an issue that will solve itself if sage ever switches to Python3...
Replying to @mezzarobba:
If I understand right, this is an issue that will solve itself if sage ever switches to Python3...
Given that we currently still have Python 2, that's irrelevant.
Replying to @jdemeyer:
Replying to @mezzarobba:
If I understand right, this is an issue that will solve itself if sage ever switches to Python3...
Given that we currently still have Python 2, that's irrelevant.
Sorry, I fear I still don't understand your comment. The problem with cmp()
was the main thing I asked about in the sage-devel post that led to this ticket, and you basically replied that it was okay from your point of view to remove the fallback code from richcmp
. Now it looks to me like you are saying the contrary.
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
fb9beb5 | misc fixes after changes to coercion_model.richcmp |
Replying to @tscrim:
The failures you have in
rigged_configurations.py
is because you moved the definition ofcase_S
. You have also made the cases in the code more difficult to follow.
Fixed, sorry for the stupid mistake. That's all I did for now. It wouldn't be hard to revert the other changes related to !=
, but I'm waiting for answers to the follow-up question I just asked on sage-devel.
Branch pushed to git repo; I updated commit sha1. New commits:
6e69645 | Elt richcmp: return NotImplemented for non-Elements... |
Another take, now returning NotImplemented
more often. Due to Jeroen's doubts, I left out the idea of treating SageObject
s in a special way. I do however handle types such as int
and float
like Element
, since this is what the coercion framework usually does.
This should make it possible to revert some of my other changes. I haven't done it yet.
Replying to @mezzarobba:
I do however handle types such as
int
andfloat
likeElement
since this is what the coercion framework usually does.
Yes, the coercion framework tries to coerce those types to a Sage Element
. However, it doesn't do anything special in the case that such a coercion fails. So I don't see why you should treat the failed-coercion-with-int
-or-float
case differently. Just return NotImplemented
in that case.
Replying to @jdemeyer:
Yes, the coercion framework tries to coerce those types to a Sage
Element
. However, it doesn't do anything special in the case that such a coercion fails.
But the net effect is that arithmetic that mixes Element
s and objects of these types will succeed or fail (with an error coming from the coercion framework) depending whether there is a common parent or not.
So I don't see why you should treat the failed-coercion-with-
int
-or-float
case differently. Just returnNotImplemented
in that case.
Do you really want things like
sage: GF(2)(1) > float(0.)
True
sage: RBF(0) < float(-1.)
True
sage: SymmetricGroup(2)(()) != int(1)
True
even though it takes exactly one LOC to prevent them and analogous code with Sage Element
s raises an error (so that, in particular, the behavior will be different for preparsed and pure Python code)?
Replying to @mezzarobba:
Do you really want things like
sage: GF(2)(1) > float(0.) True sage: RBF(0) < float(-1.) True sage: SymmetricGroup(2)(()) != int(1) True
In Python 2? Yes, I want that.
In Python 3? No, I do not want that.
In other words: just follow the common Python convention which specifies that a rich comparison (not __cmp__
) should return NotImplemented
when it cannot decide a comparison.
The fact that we directly raise a TypeError
for Sage Elements is merely a shortcut to get the correct behaviour and to get a nicer error message referring to the parents.
Replying to @jdemeyer:
In other words: just follow the common Python convention which specifies that a rich comparison (not
__cmp__
) should returnNotImplemented
when it cannot decide a comparison.
I tend to see the situation we are talking about not as one where it can't decide, but as one where it can decide that the comparison doesn't make sense. And not to find “just follow the Python[2] convention” such a strong argument when Sage already breaks it in much worse ways...
Replying to @mezzarobba:
And not to find “just follow the Python[2] convention” such a strong argument when Sage already breaks it in much worse ways...
How is it relevant whether or not Sage breaks certain Python conventions in much worse ways?
If Python specifies how to do arithmetic/comparisons, why not follow that?
Replying to @jdemeyer:
How is it relevant whether or not Sage breaks certain Python conventions in much worse ways?
When you start deviating from the general conventions, “your” objects already don't interact with the rest of the system as other developers will expect, and your job IMHO is to minimize the damage.
Now, there is no doubt that you have orders of magnitude more experience of Sage development than I'll likely ever do, so I guess I'll follow your advice even without understanding it if that's really the only way. But I'd really prefer (in general: I'm bringing this up because it is not the first time I have trouble understanding your comments) if you could explain why you think a certain technical choice is better instead of just stating that it is.
Branch pushed to git repo; I updated commit sha1. This was a forced push. Last 10 new commits:
af1a94a | Fix use of comparison in simplicial_complex |
744942c | Bug in Poset.is_slender(?) |
ea1910e | Some benign doctest failures related to #22029 |
c87d7e3 | Ad-hoc comparison functions for graph vertices |
107ad68 | {c,sparse}_graph: systematically turn integer-like vertices into ints |
fd03cd8 | graph_plot: sort by id |
69cc074 | remove or fix some meaningless doctests for cmp() |
b33f2eb | Elt richcmp: return NotImplemented for non-Elements... |
87ca195 | misc fixes after recent changes to coercion_model.richcmp() |
a5a1e2b | Jeroen doesn't like this |
New version of the complete patch series. Tested without the last commit, not fully with it.
I probably won't have much time to work on this in the coming 6-8 weeks, please feel free to take over if you have improvements to make.
The “known bug” is now #22071.
Replying to @tscrim:
I am a strong -1 on the removal of the
!=
tests and the changes offoo != bar
to (the fugly)not (foo == bar)
.
Does the new version (that returns NotImplemented
in more cases, so ill-typed comparisons by !=
will typically succeed when the other member is not an Element
) address your concerns?
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
Replying to @mezzarobba:
New version of the complete patch series. Tested without the last commit, not fully with it.
Turns out there where test failures with that commit. Since it is still not clear to me that it is really what we want, I undid it for now.
Changed keywords from none to cmp
Reviewer: Julian Rüth
I also have trouble seeing why what jdemeyer proposes would be better. For me the proposed changes look fine.
It is plausible that the ad-hoc comparison functions for graph vertices can be simplified (or even removed) thanks to the fact that the general comparison function now returns NotImplemented
for non-Elements
. I haven't had time to check yet.
Also, there are apparently test failures with sage-2.6...
I'll try to work on these issues in 1-2 weeks unless someone beats me to it.
As discussed in the sage-devel thread starting with
<o21nte$6jp$1@blaine.gmane.org>
(https://groups.google.com/d/msg/sage-devel/YVFdxPH6avk/4OZUmzLHBgAJ),coercion_model.richcmp()
should not fall back on comparing by type/id.This branch implements comparisons for
Element
the same way as in Python 3: aTypeError
is raised for uncomparable objects (instead of comparing byid
).Dependencies: #22297, #22344, #22346, #22369, #22370, #22371, #22372, #22373, #22376, #22382, #24968, #26917, #26931, #26933, #26934, #26935, #26936, #26937, #26938, #26947, #27003, #27009, #27010, #27026, #27027, #27029, #27123, #27241
Depends on #27241
CC: @fchapoton
Component: coercion
Keywords: richcmp
Author: Marc Mezzarobba, Jeroen Demeyer
Branch/Commit:
e311ab1
Reviewer: Julian Rüth, Marc Mezzarobba
Issue created by migration from https://trac.sagemath.org/ticket/22029