Tableau hash depends on subclass

darijgr commented 9 years ago

Found by Nicolas and Anne:

sage: t = StandardTableaux([3,2,1]).an_element()
sage: tt = Tableau(t[:])
sage: t == tt
True
sage: hash(t) == hash(tt)
False

Objects that compare as equal should have equal hashes.

The branch fixes this by factoring the hash through the underlying list (or, rather, tuple) that represents the tableau. Is that a good idea, or should we preserve something more about the class?

CC: @nthiery @anneschilling @sagetrac-sage-combinat @tscrim

Component: combinatorics

Keywords: tableaux, hashing

Author: Darij Grinberg

Branch/Commit: public/combinat/tableau_hash @ 2778870

Issue created by migration from https://trac.sagemath.org/ticket/19055

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

c0a33fd Hash functions for tableaux and skew tableaux without reference to their classes

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 4eae243 to c0a33fd

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from c0a33fd to 2778870

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

2778870 None does not have a fixed hash

stumpc5 commented 9 years ago

comment:4

Comparing to the hash of GelfandTsetlinPatterns, I see that there also the inner lists are turned into tuples

def _hash_(self):
   return hash(tuple(map(tuple, self)))

On the other hand, I would prefer not to see

sage: a = StandardTableau([[1,2,3]])
sage: b = GelfandTsetlinPattern([[1,2,3]])
sage: hash(a) == hash(b)
True

This is, I would prefer to keep some information about the class, though this is not done in GTPs. On the other hand, this is currently not done in many other places either, such as Permutations, Partitions, and Compositions.

Christian

anneschilling commented 9 years ago

comment:5

I agree with Christian, this needs some more discussion. For example, do we really want the behavior that

sage: Tableau([[1]]) == SemistandardTableau([[1]])
True

Don't we only want things to equal if they have the same parent?

tscrim commented 9 years ago

comment:6

I recall Nicolas telling me that having equality and the hash dependent on the parent was the goal in order to better conform to the hash requirement. However, this means we have to be much more careful about our return types. Plus it might be surprising for a user, say, a combinatorics student using Sage for a class.

Here's a possible approach I just thought of. We put a private attribute in the parent class that points to a particular class, and we use it to compare equality and the hash of that class in the element's __hash__. So it would preserve some information about the parent, but it would be controllable. So we could still have, e.g., tableaux and semistandard tableaux compare as equal, but have a be different than GT patterns. A variant of this would be to make it a method, so once can return instances of the parent class if necessary. Thoughts?

nthiery commented 9 years ago

comment:7

To be precise, I indeed in favor of:

Making sure that each object specifies clearly what equality means.
Having hash be consistent with that definition of equality to conform to the Python specifications. Otherwise we can get all sort of weird, if not non-deterministic behavior when building sets of dictionaries of mixed objects. Note that some people consider that this is to be weighted against usability of equality.
Distinguishing syntactical equality (to be tested by ==) and mathematical equality. See also http://wiki.sagemath.org/EqualityCoercion for some comments on the topic. Alas in Sage we don't have (yet?) a separate idiom for mathematical equality, so we are kind of stuck here.

So now what needs to be decided is what equality should mean for tableau-like objects.

I would tend to consider that, when A and B are two parents where A is naturally a subset of B (operations on elements don't depend, or not too much, on whether the elements are considered as in A or as in B), and both parents use the same data representation for their elements (no non-trivial embedding), it can be ok to consider a with A as parent or a with B as parent as equal, if that's what the user would expect.

With that rule of thumb, that is ok::

sage: A = StandardTableaux([2,1])
sage: B = StandardTableaux()
sage: A([[1,2],[3]]) == B([[1,2],[3]])
True

But a tableau and a Gelfand Tsetlin Pattern would not be equal (same data representation, but no canonical subset relation). Nor would I consider the partition 321 as equal to the permutation 321. I probably would not want either to consider a skew tableau with trivial inner shape equal to the corresponding tableau (different data representations).

Now all of this is a just a preliminary rule of thumb. The most important is to be well defined, as consistent as possible within a given context (e.g. tableau-like objects, permutation like objects, ...), and to advertise the specs to the user.

Cheers,

darijgr commented 9 years ago

comment:8

I like your idea, Nicolas; but I'd like to have this structure reflected in code somewhere. Something along the lines of: Every array-like class C has a pointer to the largest array-like class D such that C is transparent over D (meaning x == D(x) for every instance x of C). For tableaux of various sorts, this would be Tableau; for Gelfand-Tsetlin patterns it would probably be GelfandTsetlinPattern or something like this, unless GelfandTsetlinPattern itself is a subclass of Tableau, in which case we would have to introduce a proper subclass for actual tableaux to get the Gelfand-Tsetlin patterns out. (Sorry, I don't have Sage open right now, and I don't remember the hierarchy.)

fchapoton commented 8 years ago

comment:9

one failing doctest

sagemath / sage

Tableau hash depends on subclass #19055