sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.32k stars 451 forks source link

simplicial complexes lack hash function #12587

Closed 93808305-b40a-4e51-9e5e-87b8badd3758 closed 11 years ago

93808305-b40a-4e51-9e5e-87b8badd3758 commented 12 years ago

Simplicial complexes are lacking a proper hash function. See the example below.

sage: S = SimplicialComplex([[]]); S
Simplicial complex with vertex set () and facets {()}
sage: hash(S)
-3899221226149827755
sage: S.__hash__??
Source:
    def __hash__(self):
        return hash(self.__repr__())

Apply:

Depends on #13226 Depends on #13244 Depends on #13590

CC: @sagetrac-sage-combinat

Component: combinatorics

Author: Travis Scrimshaw

Reviewer: Christian Stump, John Palmieri

Merged: sage-5.6.beta1

Issue created by migration from https://trac.sagemath.org/ticket/12587

tscrim commented 12 years ago
comment:1

Since SimplicialComplex is mutable, the hash function must depend on something immutable. Because of this, I would like for it to be handled like Graph and throw a TypeError. However there are many dependencies on SimplicialComplex being hashable, so my thought behind setting this to wontfix is that repr is the only immutable part of the class.

jhpalmieri commented 12 years ago
comment:2

Why not instead modify the SimplicialComplex class so that it is immutable?

33c50bab-f304-4398-bab0-d28191788497 commented 12 years ago
comment:3

Being mutable and hashable is definetly a bug. See [1].

I don't understand what you mean with repr is the only immutable part, because it may change when the object changes.

[1] http://docs.python.org/glossary.html#term-hashable

tscrim commented 12 years ago
comment:4

Replying to @sagetrac-sluther:

Being mutable and hashable is definetly a bug. See [1].

I don't understand what you mean with repr is the only immutable part, because it may change when the object changes.

[1] http://docs.python.org/glossary.html#term-hashable

You're correct. I forgot that repr outputs the facets.

However I'm not in favor of making SimplicialComplex an immutable object outright since it contains mutators and there are inductive constructions which would require you to build copies. For example, suppose you had an algorithm which built a simplicial complex facet by facet, at each step you would need to clone the SimplicialComplex object in the mutator. Thus extra memory handling and multiple transient objects.

Perhaps we should do something closer to ClonableArray and have an attribute _is_mutable which we set to True when __hash__() is called and in any mutators, we do something of the following:

if _is_mutable:
    raise TypeError, "This simplicial complex is no longer mutable"

Thus making it immutable as soon as it becomes hashed?

Also we'd have

__hash__(self):
    _is_mutable = True
    return hash(self.facets())

Your thoughts?

33c50bab-f304-4398-bab0-d28191788497 commented 12 years ago
comment:5

Replying to @tscrim:

Perhaps we should do something closer to ClonableArray and have an attribute _is_mutable which we set to True when __hash__() is called and in any mutators, we do something of the following:

if _is_mutable:
    raise TypeError, "This simplicial complex is no longer mutable"

Thus making it immutable as soon as it becomes hashed?

Also we'd have

__hash__(self):
    _is_mutable = True
    return hash(self.facets())

Your thoughts?

I don't like this idea because it violoates the principle of least suprise. For example the question if an object can be modified depends upon the question if it has been added into a set or not. Better add an explicit method that sets this attribute.

tscrim commented 12 years ago
comment:6

Replying to @sagetrac-sluther:

I don't like this idea because it violoates the principle of least suprise. For example the question if an object can be modified depends upon the question if it has been added into a set or not. Better add an explicit method that sets this attribute.

Good point. Then how about this?

def set_immutable(self):
    self._is_mutable = False

def __hash__(self):
    if not _is_mutable:
        raise TypeError, "This simplicial complex must be set to immutable by calling set_immutable()"
    return hash(self.facets())
33c50bab-f304-4398-bab0-d28191788497 commented 12 years ago
comment:7

Replying to @tscrim:

Good point. Then how about this?

def set_immutable(self):
    self._is_mutable = False

def __hash__(self):
    if not _is_mutable:
        raise TypeError, "This simplicial complex must be set to immutable by calling set_immutable()"
    return hash(self.facets())

Fine, just be careful with not reversing the meaning of _is_mutable like in __hash__.

nbruin commented 12 years ago
comment:8

I have no opinion on what the best error types and messages are for this, but it is probably good if they are consistent. Since this is completely analogous to matrices, you should probably follow that example or change it to conform to yours (if you have good arguments to change it):

sage: M=matrix([[1,2],[3,4]])
sage: M.__hash__()
TypeError: mutable matrices are unhashable
sage: M[0,0]=5
sage: M.set_immutable()
sage: M.__hash__()
sage: M[0,0]=5
ValueError: matrix is immutable; please change a copy instead (i.e., use copy(M) to change a copy of M).

On the other hand:

sage: t=(1,2,3)
sage: t[0]=4
TypeError: 'tuple' object does not support item assignment

but then again a ValueError indicates something that something is wrong with the value, not with the type (tuple is never mutable, but matrix can be).

tscrim commented 12 years ago
comment:9

I feel like it should raise a TypeError since making it immutable is, in essence, a class property. Additionally Graph raises a TypeError anytime __hash__() is called.

Also more of a random thought: what if we delete the mutators when we make it immutable?

def is_mutable(self):
    del self.add_face
    del self.remove_face
jhpalmieri commented 12 years ago
comment:10

remove_face is not a mutator.

nbruin commented 12 years ago
comment:11

Replying to @tscrim:

I feel like it should raise a TypeError since making it immutable is, in essence, a class property. Additionally Graph raises a TypeError anytime __hash__() is called.

Sage isn't consistent (yet?) in this regard:

sage: L=Sequence([1,2,3])
sage: hash(L)
ValueError: mutable sequences are unhashable
sage: L.append(4)
sage: L.set_immutable()
sage: hash(L)
...
sage: L.append(4)
ValueError: object is immutable; please change a copy instead.

On the one hand, it's convenient if "hash(...)" always gives the same kind of error on an unhashable object (easier to catch). This is what Matrix does.

One the other, if a class instance can change from being unhashable to being hashable it's obviously not a property of the class, but of this individual object (its "value"). This is what Sequence does.

Don't go deleting methods! That's just a hack. Cython classes can't even do it. If you want to go that way, you should make it a separate category FrozenSimplicialComplex, ala set and frozenset in python itself, but the approach you currently propose is more in line with other sage classes.

tscrim commented 12 years ago
comment:12

Replying to @jhpalmieri:

remove_face is not a mutator.

To be consistant, I changed it into a mutator. Let me know if there are any objections.

Replying to @nbruin:

On the one hand, it's convenient if "hash(...)" always gives the same kind of error on an unhashable object (easier to catch). This is what Matrix does.

One the other, if a class instance can change from being unhashable to being hashable it's obviously not a property of the class, but of this individual object (its "value"). This is what Sequence does.

I've changed everything to raise a ValueError. I'm wondering if we should create a new exception class for mutable objects such as MutableError. Probabily needs a discussion and for sure another ticket.

Don't go deleting methods! That's just a hack. Cython classes can't even do it. If you want to go that way, you should make it a separate category FrozenSimplicialComplex, ala set and frozenset in python itself, but the approach you currently propose is more in line with other sage classes.

That's a good point about the cython class. I'm going to keep with the current methodology.

tscrim commented 12 years ago
comment:13

I'd be appreciative if anyone is willing to review this.

Also, I found another patch by vp (which I presume is vpiluad) in the sage-combinat queue which makes very small tweaks to __hash__() and __cmp__() of Simplex. So vpiluad, would it be okay if I just delete your patch?

tscrim commented 12 years ago

Author: Travis Scrimshaw

jhpalmieri commented 12 years ago
comment:15

Replying to @tscrim:

Replying to @jhpalmieri:

remove_face is not a mutator.

To be consistant, I changed it into a mutator. Let me know if there are any objections.

Why not change add_face so it's not a mutator. Wouldn't that simplify the hashing issue?

tscrim commented 12 years ago
comment:16

Replying to @jhpalmieri:

Replying to @tscrim:

Replying to @jhpalmieri:

remove_face is not a mutator.

To be consistant, I changed it into a mutator. Let me know if there are any objections.

Why not change add_face so it's not a mutator. Wouldn't that simplify the hashing issue?

Doing that would fix it. However some of the methods in SimplicialComplex use add_face() (as I recall _enlarge_subcomplex() is the main culprit), so it'd cut down on the number of transient objects. Also I feel like this is becomes consistent with Graph (and by my understanding Sequence and Matrix).

jhpalmieri commented 12 years ago
comment:17

Replying to @tscrim:

Replying to @jhpalmieri:

Replying to @tscrim:

Replying to @jhpalmieri:

remove_face is not a mutator.

To be consistant, I changed it into a mutator. Let me know if there are any objections.

Why not change add_face so it's not a mutator. Wouldn't that simplify the hashing issue?

Doing that would fix it. However some of the methods in SimplicialComplex use add_face() (as I recall _enlarge_subcomplex() is the main culprit), so it'd cut down on the number of transient objects. Also I feel like this is becomes consistent with Graph (and by my understanding Sequence and Matrix).

For what it's worth, I just searched through simplicial_complex.py, and I don't see add_face() used anywhere except for some doctests.

I think consistency with other classes is secondary. I think the central question is: is there a good reason for simplicial complexes to be mutable?

tscrim commented 12 years ago
comment:18

Replying to @jhpalmieri:

Replying to @tscrim:

Doing that would fix it. However some of the methods in SimplicialComplex use add_face() (as I recall _enlarge_subcomplex() is the main culprit), so it'd cut down on the number of transient objects. Also I feel like this is becomes consistent with Graph (and by my understanding Sequence and Matrix).

For what it's worth, I just searched through simplicial_complex.py, and I don't see add_face() used anywhere except for some doctests.

I think consistency with other classes is secondary. I think the central question is: is there a good reason for simplicial complexes to be mutable?

A priori, no. You're right, I could not find any code either (I looked through both the homology and combinat directories) which used add_face() other than for the doc-tests. I feel like there are many constructions which iteratively build-up simplicial complexes or break them down piece by piece (such as k-decomposeability) and creating copies would create transient objects and inflate the memory usage. This is why I feel that we should have SimplicialComplex be mutable.

stumpc5 commented 11 years ago
comment:19

Replying to @tscrim:

Also, I found another patch by vp (which I presume is vpiluad) in the sage-combinat queue which makes very small tweaks to __hash__() and __cmp__() of Simplex. So vpiluad, would it be okay if I just delete your patch?

My collaborator vpilaud actually opened this ticket (as his very first ticket). You can delete his small patch in the combinat queue concerning the hash of simplicial complexes.

We were/are working on a patch on implementing "subword complexes" (which are simplicial complexes) as defined by Knutson and Miller in [1]. In this context, we needed fast hashing of simplicies and of simplicial complexes for constructing them. We now use a different algorithm, but if there will eventually be an immutable version of simplicial complexes with fast hashing, we might go back to another, faster algorithm.

Anyway, I appreciate your work on this!

Christian

[1] A. Knutson and E. Miller, Subword complexes in Coxeter groups, Adv. Math. 184(1), 2004.

stumpc5 commented 11 years ago
comment:20

Since you propose to make the hash only dependent on the facets:

sage: S = SimplicialComplex([1,2,3],[[1,2]])
sage: S
Simplicial complex with vertex set (1, 2, 3) and facets {(1, 2)}

([1,2,3] is actually not the vertex set, but the ground set, while the vertex set is [1,2].) The proposed implementation only takes the facets into account for hashing, thus we get the same hash for

sage: S = SimplicialComplex([1,2],[[1,2]])
sage: S
Simplicial complex with vertex set (1, 2) and facets {(1, 2)}

Is this the desired behaviour - or should the ground set be used as well for hashing?

(I don't have a strong opinion on this question, but I thought I bring it up anyway. If you don't want to take care of the vertex set vs. ground set issue in this ticket here, I will open another and ask for other peoples opinion there.)

tscrim commented 11 years ago
comment:21

Replying to @stumpc5:

Since you propose to make the hash only dependent on the facets:

sage: S = SimplicialComplex([1,2,3],[[1,2]])
sage: S
Simplicial complex with vertex set (1, 2, 3) and facets {(1, 2)}

([1,2,3] is actually not the vertex set, but the ground set, while the vertex set is [1,2].) The proposed implementation only takes the facets into account for hashing, thus we get the same hash for

sage: S = SimplicialComplex([1,2],[[1,2]])
sage: S
Simplicial complex with vertex set (1, 2) and facets {(1, 2)}

Is this the desired behaviour - or should the ground set be used as well for hashing?

(I don't have a strong opinion on this question, but I thought I bring it up anyway. If you don't want to take care of the vertex set vs. ground set issue in this ticket here, I will open another and ask for other peoples opinion there.)

There are a few possibilities:

  1. This is a bug, in that (3,) should also be listed as a facet.
  2. We handle this more like a matroid (I believe this is what you're suggesting) and add methods like ground_set().
  3. Split the difference and make an optional argument.
  4. We just do a stopgap here and make the hash depend upon the vertex/ground set and push this to another ticket.

My thought is we handle this as a bug, but I'm happy to implement whichever is decided upon. I have no strong opinions either.

Best,

Travis

jhpalmieri commented 11 years ago
comment:22

Is this the desired behaviour?

Well, it's certainly the documented behavior (I mean the documented behavior of simplicial complexes). Note that you can also do

S = SimplicialComplex([[1,2]])

to get a complex with a single facet. So the vertex list is already an optional argument, in some sense.

I think that the hash should only depend on the facets; the unused vertices shouldn't have any effect. That makes the most mathematical sense to me.

stumpc5 commented 11 years ago
comment:23

Replying to @jhpalmieri:

I think that the hash should only depend on the facets; the unused vertices shouldn't have any effect. That makes the most mathematical sense to me.

I agree.

But I still think that mathematically, a vertex is a 0-dimensional face, and thus must be contained in some facet, see [1] or basically any book on simplicial complexes. In the example above, 3 is therefore not a vertex. (In particular, "unused vertices" do not exist.) But this is then another issue, and we can discuss it on another ticket, as Travis suggested as well.

[1] http://en.wikipedia.org/wiki/Abstract_simplicial_complex : "The vertex set of Δ is defined as V(Δ) = ∪Δ, the union of all faces of Δ"

jhpalmieri commented 11 years ago
comment:24

I think it would be absolutely fine to change the documentation of SimplicialComplex to encourage the use of

sage: SimplicialComplex( list of facets )

instead of

sage: SimplicialComplex( vertex set, list of facets )

We could also go further and change the behavior of SimplicialComplex so it ignores vertex_set even if it is present, thus essentially deprecating its use. We could formally deprecate its use, in fact.

tscrim commented 11 years ago
comment:25

Replying to @jhpalmieri:

I think it would be absolutely fine to change the documentation of SimplicialComplex to encourage the use of

sage: SimplicialComplex( list of facets )

instead of

sage: SimplicialComplex( vertex set, list of facets )

We could also go further and change the behavior of SimplicialComplex so it ignores vertex_set even if it is present, thus essentially deprecating its use. We could formally deprecate its use, in fact.

I'd be happy with deprecating vertex_set. I could go through and make those changes on this ticket.

tscrim commented 11 years ago
comment:26

Deprecated vertex_set and made all of the appropriate changes. This is now based on (the to be merged) #13244.

tscrim commented 11 years ago

Dependencies: #13244

tscrim commented 11 years ago
comment:27

Minor tweaks. Ready for review.

tscrim commented 11 years ago
comment:28

Rebased over #13590.

tscrim commented 11 years ago

Changed dependencies from #13244 to #13244 #13590

stumpc5 commented 11 years ago
comment:29

Hi Travis,

I started to review the patch. Everything looks quite well, but you seem to have missed handling the parameters properly, see below:

sage: S = SimplicialComplex(maximal_faces=[[1,4], [2,4]])
...
TypeError: 'NoneType' object is not iterable

and with this

sage: S = SimplicialComplex(vertex_set=[1,2,4],maximal_faces=[[1,4], [2,4]])    
sage: S
Simplicial complex with vertex set (1, 2, 4) and facets {(2, 4), (1, 4)}

The deprecation could maybe as well quickly be mentioned in the init of SimplicialComplex, since vertex_set is still there as an optional argument.

stumpc5 commented 11 years ago

Reviewer: Christian Stump

tscrim commented 11 years ago
comment:30

Thanks for reviewing this. Here's the updated patch fixing the above and with the added documentation in __init__().

stumpc5 commented 11 years ago
comment:31

What about

sage: S = SimplicialComplex(vertex_set=[1,2,3,4],maximal_faces=[[1,4], [2,4]])
/home/stumpc5/progs/sage-5.5.beta1/local/bin/sage-ipython:1: DeprecationWarning: vertex_set is deprecated.
See http://trac.sagemath.org/12587 for details.
  #!/usr/bin/env python
sage: S
Simplicial complex with vertex set (1, 2, 3, 4) and facets {(2, 4), (1, 4)}

sorry to be picky here: I thought we agreed that the input vertex_set is ignored, didn't we ? Mathematically, the above is simply wrong (see comment:22), so I would prefer if we could get rid of

Simplicial complex with vertex set (1, 2, 3, 4) and facets {(2, 4), (1, 4)}

Or do you think that deprecation enough?

tscrim commented 11 years ago
comment:32

I left it there in case someone was relying on that behavior as I thought the agreement was to just deprecate the functionality. However I have no problem completely ignoring it and can make the appropriate changes. Alternatively I can make a more robust warning in the init or class description about it being wrong. Which would you prefer?

stumpc5 commented 11 years ago
comment:33

Replying to @tscrim:

I left it there in case someone was relying on that behavior as I thought the agreement was to just deprecate the functionality. However I have no problem completely ignoring it and can make the appropriate changes. Alternatively I can make a more robust warning in the init or class description about it being wrong. Which would you prefer?

I'd prefer that if the user puts in something mathematically wrong, then he should be getting an error. But you're writing the patch, so you decide. I will finish my review as soon as I have your decision...

tscrim commented 11 years ago
comment:34

I've uploaded the new patch which ignores the vertex_set option. Thanks for the review.

tscrim commented 11 years ago
comment:35

Rebased off of #13226.

tscrim commented 11 years ago

Changed dependencies from #13244 #13590 to #13244 #13590 #13226

stumpc5 commented 11 years ago

Changed dependencies from #13244 #13590 #13226 to none

stumpc5 commented 11 years ago
comment:36

Attachment: trac_12587-simplicial_complex_hash-ts.patch.gz

stumpc5 commented 11 years ago
comment:37

Attachment: trac_12587-simplicial_complex_hash_review-cs.patch.gz

stumpc5 commented 11 years ago
comment:38

Replying to @stumpc5:

Everything looks good to me now -- I fixed a typo and a missing remove of the vertex_set, plus two doc strings.

So: I give it a positive review!

tscrim commented 11 years ago
comment:39

Thank you for the review.

jhpalmieri commented 11 years ago
comment:40

This looks very good. I have a few more changes, though: we should make sure that the various deprecations are documented in the reference manual (which does not include methods like __init__). Also, when I built the documentation, I got a warning about a duplicated reference for [Hat], so I removed one of them. Finally, I added doctests to the __init__ method to test the new deprecations. I think my patch needs review.

jhpalmieri commented 11 years ago

Description changed:

--- 
+++ 
@@ -10,3 +10,11 @@
     def __hash__(self):
         return hash(self.__repr__())

+ +--- + +Apply: + +- attachment: trac_12587-simplicial_complex_hash-ts.patch +- attachment: trac_12587-simplicial_complex_hash_review-cs.patch +- attachment: trac_12587-ref-jhp.patch

tscrim commented 11 years ago
comment:42

I also got the duplicate reference once (a long while ago), but never on any subsequent docbuild. I've looked over you review patch and looks good to me. Thank you for your diligence.

tscrim commented 11 years ago

Changed reviewer from Christian Stump to Christian Stump, John Palmieri

stumpc5 commented 11 years ago
comment:43

Replying to @jhpalmieri:

Also, when I built the documentation, I got a warning about a duplicated reference for [Hat], so I removed one of them.

I also had this problem only the first time I built the documentation.

Now, the reference is missing in the documentation of Simplex.product, which I don't think is the solution either. When I was doing the review, I spent some time to figure out how to solve this issue but didn't manage to do so.

Do you happen to know how to either reference a citation in a different file, or to have the same citation in different files without getting a warning?

Best, Christian

jhpalmieri commented 11 years ago
comment:44

Replying to @stumpc5:

Replying to @jhpalmieri:

Also, when I built the documentation, I got a warning about a duplicated reference for [Hat], so I removed one of them.

I also had this problem only the first time I built the documentation.

I would guess that if you delete the documentation output, or if you somehow force Sage to rebuild the documentation of delta_complex.py, you'll see the warning again.

Now, the reference is missing in the documentation of Simplex.product, which I don't think is the solution either. When I was doing the review, I spent some time to figure out how to solve this issue but didn't manage to do so.

I still see the reference, and if I click on the citation, I get taken to the reference in the other file. Also, most people reading the docstring will know what "Hatcher" means – it's not an obscure reference – so won't even need to look it up.

Do you happen to know how to either reference a citation in a different file

It works for me. Try rebuilding all of the documentation (rm -rf SAGE_ROOT/devel/sage/doc/output/ and then sage --docbuild reference html).