Open videlec opened 8 years ago
Dependencies: #21272
I don't have product
in my version of Sage, but by replacing that line with
sage: values = [S(list(t)) for t in cartesian_product([[0,1/2,1,2],] * 5)]
I was able to reproduce the issue. I need a fix for this in order to deal with hashes of ideals; see #21297.
Right product
comes from the itertools
module.
Description changed:
---
+++
@@ -2,6 +2,7 @@
sage: S.
To make this more specific, in case this helps to chase this down:
sage: R.<t> = QQ[]
sage: u = 2*t^3 + t^2; v = t^3 + 2*t + 1
sage: hash(u), hash(v)
(29698519856292282, 29698519856292282)
The key bit of code is in the function hash_c
(omitting some comments):
for i from 0<= i <= self.degree():
if i == 1:
# we delay the hashing until now to not waste it on a constant poly
var_name_hash = hash(self._parent._names[0])
c_hash = hash(self[i])
if c_hash != 0:
if i == 0:
result += c_hash
else:
# Hash (self[i], generator, i) as a tuple according to the algorithm.
result_mon = c_hash
result_mon = (1000003 * result_mon) ^ var_name_hash
result_mon = (1000003 * result_mon) ^ i
result += result_mon
which seems just not to work at all. Probably a complete rethink is in order here.
(related: I did something for nf elements at #23402)
Once #23402 is positively reviewed I will port the corresponding code here
Let's make #23402 a dependency then.
Changed dependencies from #21272 to #21272, #23402
Actually what I did in #23402 has to be rethought since it does not apply to sparse polynomials... the hashing function needs to involve the degree of the monomials in some way.
So, is #23402 then a dependency or not?
Description changed:
---
+++
@@ -9,3 +9,8 @@
sage: len(values)
1024
+ +Though hashing in this situation is delicate since we want to preserve the compatibility with: + +- sparse polynomials +- multivariate polynomials
Changed dependencies from #21272, #23402 to #23402
Changed dependencies from #23402 to none
Let us remove the dependency as there is no coercion/hash compatibility involved between polynomial rings and number fields.
Ping. Something has changed in the interim, as my example from comment:4 is no longer applicable:
sage: R.<t> = QQ[]
....: u = 2*t^3 + t^2; v = t^3 + 2*t + 1
....: hash(u), hash(v)
(-7745487741690187698, -7745487741690187694)
but there is still an overall issue:
sage: S.<t> = QQ[]
sage: from itertools import product
sage: values = [S(t) for t in product((0,1/2,1,2), repeat=5)]
sage: len(set(map(hash,values)))
491
sage: len(values)
1024
Replying to @kedlaya:
Ping. Something has changed in the interim, as my example from comment:4 is no longer applicable:
This is because of the switch to Python 3, as strings (of the variable names) are hashed differently. With a Python 2 based Sage, it still behaves as before.
Also note that this
# Hash (self[i], generator, i) as a tuple according to the algorithm.
result_mon = c_hash
result_mon = (1000003 * result_mon) ^ var_name_hash
result_mon = (1000003 * result_mon) ^ i
result += result_mon
essentially copies the implementation of hashing of tuples in Python 2. This was found to have many collisions on simple input and was replaced by a better hash function (xxHash) in Python 3.8; see https://bugs.python.org/issue34751. I guess we could just adopt this new tuple hash here to get better collision resistance.
Sounds reasonable. Does it also fix
sage: hash(-1) == hash(-2)
True
No, that is still the same. IIRC, it was mentioned somewhere that this pattern is baked too deeply into Python as well as third-party code to change it at this point.
Then using the tuple hash is probably not what you want to do (below with Python 3.8.0)
>>> from itertools import product
>>> for p in product([-1,-2], repeat=3): print(hash(p))
-6133384102531166807
-6133384102531166807
-6133384102531166807
-6133384102531166807
-6133384102531166807
-6133384102531166807
-6133384102531166807
-6133384102531166807
This is probably difficult to avoid in general, since the hashing function for polynomials should incorporate the hash values of its coefficients. As the hash values of -1 and -2 are the same, all polynomials that differ only in terms of coefficients of -1 or -2 will necessarily have the same hash.
Cython automatically returns -2 if a hashing function returns -1, so to avoid this, one would have to add a separate hashing function (which Cython does not know about) to the elements of all coefficient rings – or find some other way of obtaining hash values of -1.
Nevertheless, the problem in the ticket description is independent from this -1/-2 issue, so hashing could still be improved a lot for polynomials not involving coefficients of -1.
Batch modifying tickets that will likely not be ready for 9.1, based on a review of the ticket title, branch/review status, and last modification date.
Setting new milestone based on a cursory review of ticket status, priority, and last modification date.
--- a/sage/rings/polynomial/polynomial_element.pyx
+++ b/sage/rings/polynomial/polynomial_element.pyx
@@ -1246,7 +1246,7 @@
TypeError: unhashable type: 'sage.rings.padics.qadic_flint_CR.qAdicCappedRelativeElement'
"""
- cdef long result = 0 # store it in a c-int and just let the overflowing additions wrap
+ cdef long result = 0xb1f740b4 # store it in a c-int and just let the overflowing additions wrap
cdef long result_mon
cdef long c_hash
cdef long var_name_hash
@@ -1265,12 +1265,10 @@
result += c_hash
else:
# Hash (self[i], generator, i) as a tuple according to the algorithm.
- result_mon = c_hash
+ result_mon = c_hash ^ result
result_mon = (1000003 * result_mon) ^ var_name_hash
result_mon = (1000003 * result_mon) ^ i
result += result_mon
- if result == -1:
- return -2
return result
def _im_gens_(self, codomain, im_gens, base_map=None):
sage: S.<t> = QQ[]
sage: from itertools import product
sage: values = [S(t) for t in product((0,1/2,1,2), repeat=5)]
sage: len(set(map(hash,values)))
996
sage: len(values)
1024
--- a/sage/rings/polynomial/polynomial_element.pyx
+++ b/sage/rings/polynomial/polynomial_element.pyx
@@ -65,6 +65,8 @@
import operator
import copy
import re
+import sys
+import xxhash
from io import StringIO
from sage.cpython.wrapperdescr cimport wrapperdescr_fastcall
@@ -1256,32 +1258,22 @@
TypeError: unhashable type: 'sage.rings.padics.qadic_flint_CR.qAdicCappedRelativeElement'
"""
- cdef long result = 0 # store it in a c-int and just let the overflowing additions wrap
- cdef long result_mon
- cdef long c_hash
- cdef long var_name_hash
cdef int i
+ xx = xxhash.xxh3_64()
for i from 0<= i <= self.degree():
if i == 1:
# we delay the hashing until now to not waste it on a constant poly
var_name_hash = hash(self._parent._names[0])
+ xx.update(var_name_hash.to_bytes(8, byteorder="big", signed=True))
# I'm assuming (incorrectly) that hashes of zero indicate that the element is 0.
# This assumption is not true, but I think it is true enough for the purposes and it
# it allows us to write fast code that omits terms with 0 coefficients. This is
# important if we want to maintain the '==' relationship with sparse polys.
c_hash = hash(self.get_unsafe(i))
if c_hash != 0:
- if i == 0:
- result += c_hash
- else:
- # Hash (self[i], generator, i) as a tuple according to the algorithm.
- result_mon = c_hash
- result_mon = (1000003 * result_mon) ^ var_name_hash
- result_mon = (1000003 * result_mon) ^ i
- result += result_mon
- if result == -1:
- return -2
- return result
+ xx.update(c_hash.to_bytes(8, byteorder="big", signed=True))
+ xx.update(hash(3*(i + 7)).to_bytes(8, byteorder="big"))
+ return int.from_bytes(xx.digest(), byteorder="big") % sys.maxsize
def _im_gens_(self, codomain, im_gens, base_map=None):
"""
sage: S.<t> = QQ[]
sage: from itertools import product
sage: values = [S(t) for t in product((0, 1 / 2, 1, 2), repeat=5)]
sage: len(set(map(hash, values)))
1024
sage: len(values)
1024
A lot of conflicts:
Though hashing in this situation is delicate since we want to preserve the compatibility with:
CC: @jdemeyer
Component: algebra
Issue created by migration from https://trac.sagemath.org/ticket/21284