sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.44k stars 481 forks source link

Monoids._test_one() incorrect for unhashable one elements #16250

Closed saraedum closed 8 years ago

saraedum commented 10 years ago

In categories.Monoids._test_one() the hash value of a monoid's one is tested.

tester.assertEqual(type(one.__hash__()), int) 
tester.assertEqual(one.__hash__(), one.__hash__())

For unhashable objects, such as a p-adic one this should not be tested. It currently works, because __hash__() is inherited and therefore defined. Eventually, __hash__ will be dropped #11895. In this ticket, the behaviour is changed, so that _test_one() does not check the hash if the element does not inherit from collections.Hashable. However, this does not work for Cython extension classes: ​http://permalink.gmane.org/gmane.comp.python.cython.user/11569. For those, this check has to be skipped explicitly.

The same problem exists for CommutativeAdditiveMonoids._test_zero().

CC: @nthiery

Component: categories

Keywords: hash

Author: Julian Rüth

Branch/Commit: u/saraedum/ticket/16250 @ a40d04f

Issue created by migration from https://trac.sagemath.org/ticket/16250

saraedum commented 10 years ago

Description changed:

--- 
+++ 
@@ -6,4 +6,4 @@

For unhashable objects, such as a p-adic one this should not be tested. It currently works, because __hash__() is inherited and therefore defined. However, since p-adics overwrite __richcmp__ (in padic_generic_element.pyx) they are not hashable, i.e., hash(one) raises a TypeError.

-The same problem exists for AdditiveMonoids._test_zero(). +The same problem exists for CommutativeAdditiveMonoids._test_zero().

saraedum commented 10 years ago

Branch: u/saraedum/ticket/16250

saraedum commented 10 years ago

Author: Julian Rüth

saraedum commented 10 years ago

New commits:

7b35c8aMonoids._test_one() and CommutativeAdditiveMonoid._test_zero() should not check the hash value for unhashable elements
saraedum commented 10 years ago

Commit: 7b35c8a

nthiery commented 10 years ago
comment:6

Hi!

Thanks for shaking the generic tests with one more use case :-)

I agree that it makes sense to disable the test if the object is not hashable. On the other hand, I would much prefer to have an explicit test like:

    isinstance(..., collections.Hashable)

This better expresses the intention and also avoids the risk of false positives in case a TypeError failure would be raised for some other reasons and be silently caught.

Of course the above isinstance test fails if there is a __hash__ method defined. But this can be conveniently fixed by setting __hash__ to None in the element class. In fact this sounds like a natural thing to do anyway: having __hash__ defined and return something, and worst something that could change over time, is nothing but a potential source of issues, isn't it?

With __hash__ set to None, we have as desired:

    sage: import collections
    sage: K.<a> = Qq(9)
    sage: K.zero
    sage: isinstance(K.zero(), collections.Hashable)
    False

Agreed, the right fix (tm) would probably be to remove the infamous SageObject.__hash__ method in the first place. Not all SageObjects are hashable, and anyway computing the hash in term of _repr_ is just a continuous source of bugs. But this fix could take some work since one would need to implement __hash__ properly in a bunch of places; so let's say that this is for a separate ticket.

Cheers, Nicolas

saraedum commented 10 years ago
comment:7

Replying to @nthiery:

I agree that it makes sense to disable the test if the object is not hashable. On the other hand, I would much prefer to have an explicit test like:

    isinstance(..., collections.Hashable)

I thought about this too. I agree that this would be better in theory. The problem with this approach are things like matrices. They define a hash which raises a TypeError if the matrix is mutable.

Agreed, the right fix (tm) would probably be to remove the infamous SageObject.__hash__ method in the first place. Not all SageObjects are hashable, and anyway computing the hash in term of _repr_ is just a continuous source of bugs. But this fix could take some work since one would need to implement __hash__ properly in a bunch of places; so let's say that this is for a separate ticket.

I agree. At first it seems that basing this on _repr_ is exactly what one wants. But the little work you save is just not worth all the trouble it causes.

nthiery commented 10 years ago
comment:8

Replying to @saraedum:

I thought about this too. I agree that this would be better in theory. The problem with this approach are things like matrices. They define a hash which raises a TypeError if the matrix is mutable.

Cheers, Nicolas

saraedum commented 10 years ago
comment:9

Replying to @nthiery:

Replying to @saraedum:

I thought about this too. I agree that this would be better in theory. The problem with this approach are things like matrices. They define a hash which raises a TypeError if the matrix is mutable.

  • The methods one and zero should return immutable objects whenever possible, because more often than not the result is cached. If the result is a matrix this is no issue: it's enough to make it immutable. And then it becomes hashable. Part of the rationale of this test is precisely to check for this!

  • Having a method __hash__ that raises an error is bad advertising in the first place. In some cases it might be simpler to do it this way (e.g. for objects that can be immutable or not). But I would not want to particularly support this.

Right. But what about any type for which equality depends on a p-adic number? (Say, a polynomial with p-adic coefficients.) If we don't want to have special classes for all these, then __hash__ will have to raise a TypeError when computing the hash of its coefficients (see #11895).

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

c4f8c62AdditiveMagmas.test_zero() and Magmas.test_one() allow unhashable elements
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 7b35c8a to c4f8c62

nthiery commented 10 years ago
comment:11

Hi Julian,

Replying to @saraedum:

Right. But what about any type for which equality depends on a p-adic number? (Say, a polynomial with p-adic coefficients.) If we don't want to have special classes for all these, then __hash__ will have to raise a TypeError when computing the hash of its coefficients (see #11895).

Interesting example. Well, we are in the situation of class that declares itself as hashable by providing a hash function, and then later refutes this contract by throwing errors. I see this as a bug.

Alright, a minor enough bug that we may well not want to introduce complicated infrastructure to fix it.

However changing test_one and test_zero for this is just hiding this bug, and might hide other more important bugs in the future. I find preferable to explicitly skip _test_one() in the doctests: this will do what we mean: there is a known bug, and it's deemed minor enough to ignore it.

Now, if we would want a real fix we could have polynomial rings check whether there base ring elements can be hashed, and insert __hash__ = None in their custom element class. This could get tricky for a Cython class though. But this states quite explicitly: I am hashable if my base ring elements are.

Cheers, Nicolas

saraedum commented 10 years ago
comment:13

Replying to @nthiery:

Replying to @saraedum: However changing test_one and test_zero for this is just hiding this bug, and might hide other more important bugs in the future. I find preferable to explicitly skip _test_one() in the doctests: this will do what we mean: there is a known bug, and it's deemed minor enough to ignore it.

Ok. I'll change it.

saraedum commented 10 years ago
comment:14

Replying to @nthiery:

Replying to @saraedum: Now, if we would want a real fix we could have polynomial rings check whether there base ring elements can be hashed, and insert __hash__ = None in their custom element class. This could get tricky for a Cython class though. But this states quite explicitly: I am hashable if my base ring elements are.

Do you know if a Cython class can be marked as being unhashable? __hash__=None does not work in Cython. (Maybe that's what you meant by "This could get tricky for a Cython class though."?)

Btw., setting tp_hash does not work either:

sage: cython('''
cdef extern from "Python.h":
    long PyObject_HashNotImplemented(object)
    ctypedef struct PyTypeObject:
        long tp_hash(object)

cdef class A(object):
   pass

(<PyTypeObject *> A).tp_hash = PyObject_HashNotImplemented

''')
sage: hash(A())
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-83-c87e237c4d02> in <module>()
----> 1 hash(A())

TypeError: unhashable type: '_dev_shm_dot_sage_temp_cslogin02_10081_tmp_o8x_Zq_spyx_0.A'
sage: isinstance(A(),collections.Hashable)
True

I posted this as a question on cython-user: http://permalink.gmane.org/gmane.comp.python.cython.user/11569

nthiery commented 10 years ago
comment:15

Replying to @saraedum:

Do you know if a Cython class can be marked as being unhashable? __hash__=None does not work in Cython. (Maybe that's what you meant by "This could get tricky for a Cython class though."?)

I don't know. That's indeed typically the kind of difficulties I meant. Also that a parent can't have is own "copy" of the class to mark it as unhashable when other parent sharing the same class will want to have its elements hashable.

Thanks for investigating!

saraedum commented 10 years ago

Description changed:

--- 
+++ 
@@ -4,6 +4,6 @@
 tester.assertEqual(type(one.__hash__()), int) 
 tester.assertEqual(one.__hash__(), one.__hash__())

-For unhashable objects, such as a p-adic one this should not be tested. It currently works, because __hash__() is inherited and therefore defined. However, since p-adics overwrite __richcmp__ (in padic_generic_element.pyx) they are not hashable, i.e., hash(one) raises a TypeError. +For unhashable objects, such as a p-adic one this should not be tested. It currently works, because __hash__() is inherited and therefore defined. Eventually, __hash__ will be dropped #11598. In this ticket, the behaviour is changed, so that _test_one() does not check the hash if the element does not inherit from collections.Hashable. However, this does not work for Cython extension classes: ​http://permalink.gmane.org/gmane.comp.python.cython.user/11569. For those, this check has to be skipped explicitly.

The same problem exists for CommutativeAdditiveMonoids._test_zero().

saraedum commented 10 years ago

Description changed:

--- 
+++ 
@@ -4,6 +4,6 @@
 tester.assertEqual(type(one.__hash__()), int) 
 tester.assertEqual(one.__hash__(), one.__hash__())

-For unhashable objects, such as a p-adic one this should not be tested. It currently works, because __hash__() is inherited and therefore defined. Eventually, __hash__ will be dropped #11598. In this ticket, the behaviour is changed, so that _test_one() does not check the hash if the element does not inherit from collections.Hashable. However, this does not work for Cython extension classes: ​http://permalink.gmane.org/gmane.comp.python.cython.user/11569. For those, this check has to be skipped explicitly. +For unhashable objects, such as a p-adic one this should not be tested. It currently works, because __hash__() is inherited and therefore defined. Eventually, __hash__ will be dropped #11895. In this ticket, the behaviour is changed, so that _test_one() does not check the hash if the element does not inherit from collections.Hashable. However, this does not work for Cython extension classes: ​http://permalink.gmane.org/gmane.comp.python.cython.user/11569. For those, this check has to be skipped explicitly.

The same problem exists for CommutativeAdditiveMonoids._test_zero().

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from c4f8c62 to 27e5599

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

65ffe60Merge branch 'develop' into ticket/16250
27e5599Skip checks for hash in _test_one()/test_zero() for unhashable elements
mezzarobba commented 9 years ago

Work Issues: merge conflict

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

a40d04fMerge branch 'develop' into t/16250/ticket/16250
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 27e5599 to a40d04f

saraedum commented 9 years ago

Changed work issues from merge conflict to none

saraedum commented 8 years ago
comment:24

fixed in #19016.