Closed Quenouillaume closed 3 years ago
Description changed:
---
+++
@@ -1 +1,3 @@
+AUTHOR: Guillaume Rousseau
+A module for filtered complexes and persistent homology. The algorithm implemented is described in 'Computing Persistent Homology' by A. Zomorodian and G. Carlsson.
Commit: 6aa4ece
Author: Guillaume Rousseau
There's no code on this branch
Branch pushed to git repo; I updated commit sha1. New commits:
ef3b2ab | corrected indentation in references |
Apologies for the first ticket that had no code on its branch, this is my first time using trac !
Changed keywords from none to homology, persistent-homology
I would suggest not to create a new top-level package sage.persistent_homology
but rather to nest it
I am not sure, but since this code seems already packaged at https://github.com/Quenouillaume/persil, why not let it be part of the task #31164 ? Also, some 10 years ago i saw some optimized code for persistent homology (by Edelsbrunner iirc), so i wonder whether there is room for interfacing such code within Sage ?
The code at https://github.com/Quenouillaume/persil was a first draft, in particular I used my own homemade versions of the Simplex
and FreeModule
classes ! The code on this ticket's branch replaces those with built-in Sage functionalities.
Replying to @sagetrac-tmonteil:
I am not sure, but since this code seems already packaged at https://github.com/Quenouillaume/persil, why not let it be part of the task #31164 ? Also, some 10 years ago i saw some optimized code for persistent homology (by Edelsbrunner iirc), so i wonder whether there is room for interfacing such code within Sage ?
Thank you for adding this code. However, there are a few things that will need to be addressed before this can be included.
sage/homology
folder.__foo
methods. There is no need to name-mangle. Just _foo
is fine. (This will also make it easier to doctest; note that this does not apply to special methods like __getitem__
.)Avoid useless checks here with short-circuiting:
-if j%1000 == 0 and self._verbose:
+if self._verbose and j%1000 == 0:
_numSimplices
-> _num_simplices
.Branch pushed to git repo; I updated commit sha1. New commits:
805ba8e | almost 100% PEP8 compatible |
Added doctests for all methods, and adapted to PEP8. A few lines are still more than 80 characters long, most of them being in the doctests.
Also moved files and documentation to the homology folders.
Branch pushed to git repo; I updated commit sha1. New commits:
8b7666f | fixed documentation |
Description changed:
---
+++
@@ -1,3 +1,9 @@
-AUTHOR: Guillaume Rousseau
+We add support for filtered complexes and persistent homology.
-A module for filtered complexes and persistent homology. The algorithm implemented is described in 'Computing Persistent Homology' by A. Zomorodian and G. Carlsson.
+The algorithm implemented is described in 'Computing Persistent Homology' by A. Zomorodian and G. Carlsson.
+
+Related:
+
+- [Wikipedia: Persistent homology computation](https://en.wikipedia.org/wiki/Persistent_homology#Computation)
+- [sage-support discussion on persistent homology](https://groups.google.com/g/sage-support/c/a2MS9WALjvo)
+
Some comments:
It might be a good idea to have ways of converting to and from simplicial complexes: for instance, if X is a filtered complex, then a method that takes a value n and returns the complex made up of all simplices with filtration at most n. Also a _simplicial_
method so that SimplicialComplex(X)
works. In addition, take a simplicial complex and make it a filtered complex with everything in filtration 0.
Related to that, I was wondering if it made sense for FilteredComplex
to inherit from SimplicialComplex
. I don't think so, but I thought I'd ask.
Also related to that, would it make sense to have a way of modifying the filtration value for a given simplex? I would suggest renaming get_value
to filtration
. Maybe it could take two arguments: a required argument s
, a simplex, and if that is the only argument, then return its filtration. If an optional argument v
is present, then instead set its filtration to v
.
Throughout, I would change "filtered complex" to "filtered simplicial complex", and I would change the name of the file and the name of the class accordingly. When I see "filtered complex," I tend to think of a filtered chain complex, and then I think that there should be spectral sequences lurking somewhere. (That would be a great task for someone to take on.) To make things unambiguous, we should have FilteredChainComplex
(I hope eventually) and FilteredSimplicialComplex
(from this ticket).
Regarding
This implementation requires filtration values to be positive. Most
features will not work with negative values.
Should we have a warning or error if the user uses negative filtration values? I would also add this comment to the docstring for the __init__
method, so that users will see if when they do FilteredComplex?
.
__init__
method, I like the comments about _vertices
, _simplices
, and _degrees_dict
. Can you add a similar comment about _warnings
? (Or document the warnings
argument.) Is degree
the same as filtration value? I might change that language, at least in the docstring and names of the user-viewable variables (for example simplex_degree_list
could perhaps just be renamed to simplices
).+1 on making this a subclass of SimplicialComplex
and changing the name to FilteredSimplicialComplex
.
Replying to @tscrim:
+1 on making this a subclass of
SimplicialComplex
and changing the name toFilteredSimplicialComplex
.
This might require a lot of work: various methods for SimplicialComplex
would have to be modified to work with FilteredSimplicialComplex
, like product, wedge, join, ..., unless we just forget the filtration. At least FilteredSimplicialComplex
could inherit from GenericCellComplex
, and maybe there could be a new intermediate class that both it and SimplicialComplex
could inherit from.
Hmm...yea, I guess so. Well, I am okay with including this class as-is for now since it will be essentially all implementation details changed by such a refactoring. The only thing I would want would be to change compute_homology()
to persistent_homology()
so it is unambiguous about what type of homology is being computed. (The Betti numbers are fine because of the extra parameters.)
I'm working on implementing all those suggestions, thank you for the very insightful feedback !
I also think that having FilteredSimplicialComplex
inherit from SimplicialComplex
is a bit odd, for all the reasons mentionned above, but I'm not sure it should inherit from GenericCellComplex
either, since they don't really have the same homology type...
Maybe a simple first solution would be, as jhpalmieri suggests, to have a _simplicial_()
method for FilteredSimplicialComplex
, which returns the maximal simplicial complex associated to the filtered complex, as well as a prune(max_value)
method which returns a copy of the filtered complex, where simplices of value more than max_value
have been removed.
Replying to @tscrim:
The only thing I would want would be to change
compute_homology()
topersistent_homology()
so it is unambiguous about what type of homology is being computed. (The Betti numbers are fine because of the extra parameters.)
+1
Branch pushed to git repo; I updated commit sha1. New commits:
fe67d65 | fixed class name, added `_simplicial_` and prune, removed negative constraint |
d2bc95f | removed mention of degree, fused insert and get_value into filtration |
83b78fd | fixed docstring error |
c8dede0 | changed file name, added doc for filtration method |
Branch pushed to git repo; I updated commit sha1. New commits:
8bd967d | PEP8 |
Branch pushed to git repo; I updated commit sha1. New commits:
9301aa1 | added more doc |
I don't really like the fact that compute_persistent_homology()
in some sense fixes the field you are computing the homology over and that it needs to be called before getting the intervals without a meaningful error message. Plus the result is only partially cached as when you run this again, you obtain new data that overwrites the old computed data. So my proposal is to change it as follows:
compute_persistent_homology()
hidden by _compute_persistent_homology()
and then have it return the resulting intervals.intervals()
(perhaps call that persistence_intervals
) that has field
and strict
as part of its key. So it will be decorated by @cached_method(key=lambda self,f,s,v: (f,s))
.intervals
.Also, why can't you just pass _filtration_dict
as a list of simplicies to SimplicialComplex
in _simplicial_
? Related, I also don't see the need for the _simplicies
list as you can get it as the keys of _filtration_dict
. The sorting is only needed during the computation of the persistent homology, which you could construct in there.
Branch pushed to git repo; I updated commit sha1. New commits:
47403e2 | cached computation of homology, made `_simplicial_` simpler, removed some useless attributes |
Replying to @tscrim:
I don't really like the fact that
compute_persistent_homology()
...
I agree, I didn't know about the cached_method decorator and it can probably be very useful here. However there is a problem: since FilteredSimplicialComplex
objects can have their simplices inserted or changed, the cache key needs to take into account whether the complex has changed since last time persistent homology was computed or not.
I was thinking of just counting the number of times that the complex has been modified and have that also be part of the cache key, but maybe there is a simpler / less hacky solution...
In any case, I think that the _compute_persistent_homology
method should be cached rather than the persistence_intervals
method, since it can be called when computing intervals or betti numbers.
Also, why can't you just pass
_filtration_dict
as a list of simplicies toSimplicialComplex
in_simplicial_
? Related, ...
Yes this is much simpler, thank you !
New commits:
47403e2 | cached computation of homology, made `_simplicial_` simpler, removed some useless attributes |
I would just call foo.cached_func.clear_cache()
to clear the cache when the complex is mutated.
Addendum: The way you have it implemented now is effectively a memory leak since you are always adding more stuff to the cache every time you run it.
Branch pushed to git repo; I updated commit sha1. New commits:
bafca26 | simplified cache |
Branch pushed to git repo; I updated commit sha1. New commits:
354b23d | _persistent_homology now properly returns intervals |
Setting a new milestone for this ticket based on a cursory review.
Sorry for loosing track of this; I didn't get any notifications from trac that you pushed changes.
I think all of my and John's current comments have been taken into account, correct?
Can you provide a rebase onto the latest beta? I will continue my review at that point.
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
3e39330 | fixed class name, added `_simplicial_` and prune, removed negative constraint |
66616b7 | removed mention of degree, fused insert and get_value into filtration |
1fd4564 | fixed docstring error |
09aada4 | changed file name, added doc for filtration method |
bfcbc93 | PEP8 |
eb621ff | added more doc |
d0811c7 | cached computation of homology, made `_simplicial_` simpler, removed some useless attributes |
16a8d9b | simplified cache |
bdfd1e9 | _persistent_homology now properly returns intervals |
98f2a1d | Merge branch 'u/gh-Quenouillaume/persistent_homology' of trac.sagemath.org:sage into t/31861/persistent_homology |
I believe I've taken all your comments into account ! And I've just finished rebasing if you wish to keep reviewing.
We add support for filtered complexes and persistent homology.
The algorithm implemented is described in 'Computing Persistent Homology' by A. Zomorodian and G. Carlsson.
Related:
CC: @jhpalmieri @slel @tscrim
Component: algebraic topology
Keywords: homology, persistent-homology
Author: Guillaume Rousseau
Branch/Commit:
a70ea5b
Reviewer: John Palmieri, Travis Scrimshaw
Issue created by migration from https://trac.sagemath.org/ticket/31861