sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.44k stars 480 forks source link

py3: fixes for sage.schemes.hyperelliptic_curves #25946

Closed embray closed 6 years ago

embray commented 6 years ago

The main thing needed here was a __hash__. Or removing __eq__ and __ne__

Part of #24551

CC: @bhutz @JohnCremona @roed314

Component: python3

Keywords: sagedays@icerm

Author: Erik Bray

Branch/Commit: f8687f7

Reviewer: Travis Scrimshaw, David Roe

Issue created by migration from https://trac.sagemath.org/ticket/25946

tscrim commented 6 years ago

Reviewer: Travis Scrimshaw

tscrim commented 6 years ago

Changed keywords from none to sagedays@icerm

tscrim commented 6 years ago
comment:2

You have too much data in your __hash__:

sage: P.<x> = QQ[]
sage: f0 = 4*x^5 - 30*x^3 + 45*x - 22
sage: C0 = HyperellipticCurve(f0)
sage: P.<x> = RR[]
sage: f1 = 4*x^5 - 30*x^3 + 45*x - 22
sage: C1 = HyperellipticCurve(f1)
sage: C1 == C0
True
sage: hash(C1) == hash(C0)
False

So you definitely have to remove _PP from the hash, and I would probably remove both the __class__ as subclasses are not used in the __eq__ comparison.

embray commented 6 years ago
comment:3

I'm not 100% sure about that. I'd need to double-check what the intention is here. On Python 2 it was inheriting the flimsy default __hash__ from CategoryObject that just hashes its __repr__. So for this case I was including essentially the same data in the hash that would be needed to differentiate the reprs of two of these objects.

But maybe that's not necessary. It just wasn't clear what the intent was (if any) due to the default hash...

tscrim commented 6 years ago
comment:5

I completely agree the repr hash is bad. I think the data used for equality should be what is used for the __hash__, not more. From what you're saying, it seems like we are actually fixing a bug if we only hash self._hyperelliptic_polynomials.

My understanding behind the default hash is that every object has a repr and generally two objects that compare equal have equal repr output.

I don't use/know this code, so we probably have to ask someone who understands it better to confirm if you're worried enough.

embray commented 6 years ago
comment:6

I could certainly try a simpler repr to start with and see what happens. If that works, then I agree with you we should just use what __eq__ compares.

tscrim commented 6 years ago
comment:7

Sounds good. Thanks.

fchapoton commented 6 years ago
comment:8

any progress here ?

embray commented 6 years ago
comment:9

I mean, clearly not. I'd prefer if someone who knew this code were to chime in, but otherwise I'll get to it if/when I feel like taking the time to understand this code better.

embray commented 6 years ago
comment:10

To be clear, I also agree with everything Travis wrote on this ticket; I'm just not in a rush to act on it because I don't feel qualified to make that judgment call w.r.t. this code.

tscrim commented 6 years ago
comment:11

Ben, do you know about this code or know who we should cc?

bhutz commented 6 years ago
comment:12

I'm not familiar with this code. John may know.

JohnCremona commented 6 years ago
comment:13

I don't know or use the code (except perhaps occasionally) and so I do not really understand the issue here. I don't think that I have ever written such a has function so do not know what properties it is supposed to have.

What exactly was the problem which this patch is intended to fix?

alexjbest commented 6 years ago
comment:14

I'm really confused about whether or not we should call these hyperelliptic curves equal in the first place. The current code returns true because the polynomials are equal which is because both variables are named x even though they are over different fields, are two curves over different fields really _equal_, even if one is just a base change of the other? For contrast

sage: P.<x,y>= RR[]
sage: C0 = Curve(y - x^2  - 1)
sage: C0
Affine Plane Curve over Real Field with 53 bits of precision defined by -x^2 + y - 1.00000000000000
sage: P.<x,y>= QQ[]
sage: C1 = Curve(y - x^2  - 1)
sage: C1
Affine Plane Curve over Rational Field defined by -x^2 + y - 1
sage: C0 == C1
False
embray commented 6 years ago
comment:15

Replying to @alexjbest:

I'm really confused about whether or not we should call these hyperelliptic curves equal in the first place. The current code returns true because the polynomials are equal which is because both variables are named x even though they are over different fields, are two curves over different fields really _equal_, even if one is just a base change of the other?

I had the same doubt before. There's some inconsistency throughout Sage about when some objects defined over different fields are considered equal under ==. In some cases it's quite deliberate, in other cases it's not clear until and unless the original author chimes in. If the author isn't even sure I would lean against calling them ==, much less having the same hash.

JohnCremona commented 6 years ago
comment:16

I agree with Alex: to be equal the fields of definition should equal and the defining polynomials identical. This is the case with elliptic curves:

sage: E = EllipticCurve([0,1])
sage: E
Elliptic Curve defined by y^2 = x^3 + 1 over Rational Field
sage: K.<i> = NumberField(x^2+1)
sage: EK = E.change_ring(K)
sage: EK
Elliptic Curve defined by y^2 = x^3 + 1 over Number Field in i with defining polynomial x^2 + 1
sage: E == EK
False
sage: hash(E)
8741953973726
sage: hash(EK)
8741951786400

I don't know what this hash function actually does: it calls hash_by_id, whatever that is. I also don't know how to find out what function is being called by E==EK either!

tscrim commented 6 years ago
comment:17

EllipticCurve is a UniqueFactory, so the input (i.e., in part, the field) uniquely identifies the curve. So the hash and == is obtained by using id as there is a unique such object in memory at a given time. Changing the hyperelliptic to suppose the UniqueRepresentation-type behavior would be a very invasive and complicated surgery I think.

So the short-term fix IMO would be deciding an appropriate notion of __eq__, and then changing the __hash__ to match. From what John is saying, we should also introduce a check that the defining fields are equal.

Also, from git blame, "Nick Alexander" and "tornaria" comes up a lot and David Kohel is listed in the copyright.

fchapoton commented 6 years ago

Changed commit from cb04ceb to 78ae922

fchapoton commented 6 years ago
comment:18

I propose a solution : simply remove __eq__, __ne__ and __hash__ here.

Then they are all provided by the general setting of projective subschemes, which seems to be a good idea. This will compare ambient spaces and defining ideals, which amount to compare base field and polynomials.


New commits:

2726198Merge branch 'u/embray/python3/sage-schemes-hyperelliptic_curves/misc' of ssh://trac.sagemath.org:22/sage into 8.4.b0
78ae922py3: comparison and hash cleanup for hyperellliptic curves
fchapoton commented 6 years ago

Changed branch from u/embray/python3/sage-schemes-hyperelliptic_curves/misc to public/25946

JohnCremona commented 6 years ago
comment:19

That sounds very sensible to me -- I do not know why the person who implemented these special functions thought it was necessary to override those of the parent class.

As far as I am concerned this is good to merge assuming that tests all pass.

fchapoton commented 6 years ago

Description changed:

--- 
+++ 
@@ -1 +1,4 @@
-The main thing need here was a `__hash__`.
+The main thing needed here was a `__hash__`.
+Or removing `__eq__` and `__ne__`
+
+Part of #24551
fchapoton commented 6 years ago
comment:21

failing doctests in hyperelliptic_padic_field.py (sigh)

embray commented 6 years ago
comment:22

This seems like a reasonable approach, but it seems that you exposed a poor assumption that was made somewhere else. I don't know what a sage.rings.padics.padic_ZZ_pX_CR_element.pAdicZZpXCRElement is, or why it isn't hashable. Perhaps it should be?

If it definitely shouldn't be hashable (e.g. it is not immutable) then maybe the problem is in MPolynomialIdeal.__richcmp__'s assumption that the generators of an ideal are definitely hashable (and if not, it should do something else to compare gens).

fchapoton commented 6 years ago
comment:23

There is a _cache_key method on these objects inside src/sage/rings/padics/padic_ZZ_pX_CR_element.pyx

But its doc claims that there is no reasonable hash..

JohnCremona commented 6 years ago
comment:24

Adding David Roe in CC since he may be able to throw light on the p-adic ring issue. Note that p-adic rings (as with the reals) are not exact, so comparison between elements is not so easy.

embray commented 6 years ago
comment:25

Incidentally, MPolynomialIdeal.__richcmp__ was last touched by #23920. The idea to use sets to compare gens was from Travis: #23920 comment:9

Perhaps there should be a try/except around this and attempt direct comparison if comparing by sets raises a TypeError. I'm not sure what the impact is of computing a Groebner basis, or if it can be avoided in some other way as well.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from 78ae922 to 2431727

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

2431727adding the hash
fchapoton commented 6 years ago
comment:27

adding hash from the existing _cache_key seems to fix the failing doctests.. But is this a correct thing to do ?

embray commented 6 years ago
comment:28

Replying to @fchapoton:

adding hash from the existing _cache_key seems to fix the failing doctests.. But is this a correct thing to do ?

No, I don't believe so, and the I think the explanation in the _cache_key docs is fairly clear why. Instead: #25946 comment:25

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

d5573efpy3: add a `__hash__` for HyperellipticCurve_generic
f29c001py3: need explicit cast to int for range
3771812py3: comparison and hash cleanup for hyperellliptic curves
9f9399ctiny change in the comparison of ideals inside polynomial rings
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from 2431727 to 9f9399c

fchapoton commented 6 years ago
comment:30

ok. So here is my next proposal : when comparing ideals generated by a single polynomial, just compare this generator (no need to wrap by set).

embray commented 6 years ago
comment:31

True. Is that case common enough to merit a special case? Are there cases of ideals of rings over these types of p-adics whose ideal would be generated by more than one polynomials (and hence would still crash on the existing code)?

What I'm saying is just, why not:

try:
    same_gens = set(self.gens()) == set(other.gens())
except TypeError:
    same_gens = False

if same_gens:
    # Do whatever we do currently
else:
    # Do whatever is necessary to compare the gens (construct a GB, etc).
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

ecc4b36change again the comparison of ideals
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from 9f9399c to ecc4b36

JohnCremona commented 6 years ago
comment:33

Replying to @embray:

True. Is that case common enough to merit a special case? Are there cases of ideals of rings over these types of p-adics whose ideal would be generated by more than one polynomials (and hence would still crash on the existing code)?

Certainly: in ZZ_p[X] the ideal generated by p and X. But the number of gens will always be small, so why not sort the gens, however many there are?

This is an example of a common problem in Sage and other systems. If I define two ideals (or other structures) in different ways and ask whether they are equal then the test may be very expensive. But here we just test whether they have been presented in the same way (or with trivial changes such as changing the order of the generators).

What I'm saying is just, why not:

try:
    same_gens = set(self.gens()) == set(other.gens())
except TypeError:
    same_gens = False

if same_gens:
    # Do whatever we do currently
else:
    # Do whatever is necessary to compare the gens (construct a GB, etc).

New commits:

ecc4b36change again the comparison of ideals
fchapoton commented 6 years ago
comment:34

Here is another tentative : first compare without set() then compare with set()

embray commented 6 years ago
comment:35

Replying to @fchapoton:

Here is another tentative : first compare without set() then compare with set()

That could still crash. I don't know the math well enough to construct an interestingexample case, but say you have two copies of the same ideal, but their generators just happen to be in a different order, as in:

sage: R.<x,y> = QQ[]
sage: R.ideal(x, y).gens()
[x, y]
sage: R.ideal(y, x).gens()

This is the sort of case, as John wrote, that the set() call is intended for in the first place. But if the elements x and y are not hashable, then the first test will still fail, and then the second test will be evaluated and will crash.

IMO this comparison of gens() is just a special case shortcut which is fine as-is, but it should account for the possibility that some elements just aren't hashable, that's all.

JohnCremona commented 6 years ago
comment:36

Does hashable imply not sortable? As I said, the lists of gens will be short normally, so something like testing sorted(s_gens)==sorted(o_gens) might work when the set() constructor does not? There will still be silly trivial cases such as first giving {x,y} as gens then {x,x,y} and for sure someone will then complain. But even as it is, we are not returning equality of the ideals (x,y) and (x+y,y). We are never going to catch all trivial cases (and different users' idea of a trivial case will differ) so just comparing the list of gens with no processing at all is safest and most easily explained.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from ecc4b36 to d47fbb3

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

d47fbb3wrap with try
fchapoton commented 6 years ago
comment:38

ok, I have wrapped with "try / except".

embray commented 6 years ago
comment:39

Replying to @JohnCremona:

Does hashable imply not sortable? As I said, the lists of gens will be short normally, so something like testing sorted(s_gens)==sorted(o_gens) might work when the set() constructor does not? There will still be silly trivial cases such as first giving {x,y} as gens then {x,x,y} and for sure someone will then complain. But even as it is, we are not returning equality of the ideals (x,y) and (x+y,y). We are never going to catch all trivial cases (and different users' idea of a trivial case will differ) so just comparing the list of gens with no processing at all is safest and most easily explained.

I don't think sorting really helps here, because as I found out over in #25948, some elements can be "sorted" but the ordering is meaningless and unpredictable.

embray commented 6 years ago
comment:40

Replying to @sagetrac-git:

Branch pushed to git repo; I updated commit sha1. New commits:

d47fbb3wrap with try

You could still get rid of (s_gens == o_gens). I don't think it's that useful.

fchapoton commented 6 years ago
comment:41

Replying to @embray:

Replying to @sagetrac-git:

Branch pushed to git repo; I updated commit sha1. New commits:

d47fbb3 wrap with try

You could still get rid of (s_gens == o_gens). I don't think it's that useful.

Yes it is. It will handle the precise case that we are dealing with in this ticket, hyperelliptic curves over p-adics.

embray commented 6 years ago
comment:42

Ok, but I think as written you don't need the same_gens variable. I only wrote it that way in my pseudo-code because I wasn't directly looking at the real code at the time, and didn't consider using a try/except/else. I don't think we need to worry about rich_to_bool raising a TypeError.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from d47fbb3 to b129fc9

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

b129fc9fix