sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.43k stars 479 forks source link

Persistent homology #31861

Closed Quenouillaume closed 3 years ago

Quenouillaume commented 3 years ago

We add support for filtered complexes and persistent homology.

The algorithm implemented is described in 'Computing Persistent Homology' by A. Zomorodian and G. Carlsson.

Related:

CC: @jhpalmieri @slel @tscrim

Component: algebraic topology

Keywords: homology, persistent-homology

Author: Guillaume Rousseau

Branch/Commit: a70ea5b

Reviewer: John Palmieri, Travis Scrimshaw

Issue created by migration from https://trac.sagemath.org/ticket/31861

Quenouillaume commented 3 years ago

Branch: u/gh-Quenouillaume/persistent_homology

Quenouillaume commented 3 years ago

Description changed:

--- 
+++ 
@@ -1 +1,3 @@
+AUTHOR: Guillaume Rousseau

+A module for filtered complexes and persistent homology. The algorithm implemented is described in 'Computing Persistent Homology' by A. Zomorodian and G. Carlsson.
Quenouillaume commented 3 years ago

Commit: 6aa4ece

Quenouillaume commented 3 years ago

Author: Guillaume Rousseau

mkoeppe commented 3 years ago
comment:3

There's no code on this branch

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

621d748formatted code for sage standards and added comments
c7f1f74added more doc and reference to Zomorodian-Carlsson
2d862d6corrected format for documentation
4c97200finished up docstrings
a7635a9added copyright
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 6aa4ece to a7635a9

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from a7635a9 to ef3b2ab

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

ef3b2abcorrected indentation in references
Quenouillaume commented 3 years ago
comment:6

Apologies for the first ticket that had no code on its branch, this is my first time using trac !

Quenouillaume commented 3 years ago

Changed keywords from none to homology, persistent-homology

mkoeppe commented 3 years ago
comment:8

I would suggest not to create a new top-level package sage.persistent_homology but rather to nest it

edd8e884-f507-429a-b577-5d554626c0fe commented 3 years ago
comment:9

I am not sure, but since this code seems already packaged at https://github.com/Quenouillaume/persil, why not let it be part of the task #31164 ? Also, some 10 years ago i saw some optimized code for persistent homology (by Edelsbrunner iirc), so i wonder whether there is room for interfacing such code within Sage ?

edd8e884-f507-429a-b577-5d554626c0fe commented 3 years ago
comment:10

See https://en.wikipedia.org/wiki/Persistent_homology#Computation

Quenouillaume commented 3 years ago
comment:11

The code at https://github.com/Quenouillaume/persil was a first draft, in particular I used my own homemade versions of the Simplex and FreeModule classes ! The code on this ticket's branch replaces those with built-in Sage functionalities.

Replying to @sagetrac-tmonteil:

I am not sure, but since this code seems already packaged at https://github.com/Quenouillaume/persil, why not let it be part of the task #31164 ? Also, some 10 years ago i saw some optimized code for persistent homology (by Edelsbrunner iirc), so i wonder whether there is room for interfacing such code within Sage ?

tscrim commented 3 years ago
comment:12

Thank you for adding this code. However, there are a few things that will need to be addressed before this can be included.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from ef3b2ab to 96b91f0

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

014e7d4moved into homology folder
5243b3cadapted to PEP8
96b91f0all methods doctested
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

805ba8ealmost 100% PEP8 compatible
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 96b91f0 to 805ba8e

Quenouillaume commented 3 years ago
comment:15

Added doctests for all methods, and adapted to PEP8. A few lines are still more than 80 characters long, most of them being in the doctests.

Also moved files and documentation to the homology folders.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 805ba8e to 8b7666f

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

8b7666ffixed documentation
slel commented 3 years ago

Description changed:

--- 
+++ 
@@ -1,3 +1,9 @@
-AUTHOR: Guillaume Rousseau
+We add support for filtered complexes and persistent homology.

-A module for filtered complexes and persistent homology. The algorithm implemented is described in 'Computing Persistent Homology' by A. Zomorodian and G. Carlsson.
+The algorithm implemented is described in 'Computing Persistent Homology' by A. Zomorodian and G. Carlsson.
+
+Related:
+
+- [Wikipedia: Persistent homology computation](https://en.wikipedia.org/wiki/Persistent_homology#Computation)
+- [sage-support discussion on persistent homology](https://groups.google.com/g/sage-support/c/a2MS9WALjvo)
+
jhpalmieri commented 3 years ago
comment:18

Some comments:

    This implementation requires filtration values to be positive. Most
    features will not work with negative values.

Should we have a warning or error if the user uses negative filtration values? I would also add this comment to the docstring for the __init__ method, so that users will see if when they do FilteredComplex?.

tscrim commented 3 years ago
comment:19

+1 on making this a subclass of SimplicialComplex and changing the name to FilteredSimplicialComplex.

jhpalmieri commented 3 years ago
comment:20

Replying to @tscrim:

+1 on making this a subclass of SimplicialComplex and changing the name to FilteredSimplicialComplex.

This might require a lot of work: various methods for SimplicialComplex would have to be modified to work with FilteredSimplicialComplex, like product, wedge, join, ..., unless we just forget the filtration. At least FilteredSimplicialComplex could inherit from GenericCellComplex, and maybe there could be a new intermediate class that both it and SimplicialComplex could inherit from.

tscrim commented 3 years ago
comment:21

Hmm...yea, I guess so. Well, I am okay with including this class as-is for now since it will be essentially all implementation details changed by such a refactoring. The only thing I would want would be to change compute_homology() to persistent_homology() so it is unambiguous about what type of homology is being computed. (The Betti numbers are fine because of the extra parameters.)

Quenouillaume commented 3 years ago
comment:22

I'm working on implementing all those suggestions, thank you for the very insightful feedback !

I also think that having FilteredSimplicialComplex inherit from SimplicialComplex is a bit odd, for all the reasons mentionned above, but I'm not sure it should inherit from GenericCellComplex either, since they don't really have the same homology type...

Maybe a simple first solution would be, as jhpalmieri suggests, to have a _simplicial_() method for FilteredSimplicialComplex, which returns the maximal simplicial complex associated to the filtered complex, as well as a prune(max_value) method which returns a copy of the filtered complex, where simplices of value more than max_value have been removed.

jhpalmieri commented 3 years ago
comment:23

Replying to @tscrim:

The only thing I would want would be to change compute_homology() to persistent_homology() so it is unambiguous about what type of homology is being computed. (The Betti numbers are fine because of the extra parameters.)

+1

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

fe67d65fixed class name, added `_simplicial_` and prune, removed negative constraint
d2bc95fremoved mention of degree, fused insert and get_value into filtration
83b78fdfixed docstring error
c8dede0changed file name, added doc for filtration method
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 8b7666f to c8dede0

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from c8dede0 to 8bd967d

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

8bd967dPEP8
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 8bd967d to 9301aa1

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

9301aa1added more doc
tscrim commented 3 years ago
comment:27

I don't really like the fact that compute_persistent_homology() in some sense fixes the field you are computing the homology over and that it needs to be called before getting the intervals without a meaningful error message. Plus the result is only partially cached as when you run this again, you obtain new data that overwrites the old computed data. So my proposal is to change it as follows:

Also, why can't you just pass _filtration_dict as a list of simplicies to SimplicialComplex in _simplicial_? Related, I also don't see the need for the _simplicies list as you can get it as the keys of _filtration_dict. The sorting is only needed during the computation of the persistent homology, which you could construct in there.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 9301aa1 to 47403e2

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

47403e2cached computation of homology, made `_simplicial_` simpler, removed some useless attributes
Quenouillaume commented 3 years ago
comment:29

Replying to @tscrim:

I don't really like the fact that compute_persistent_homology()...

I agree, I didn't know about the cached_method decorator and it can probably be very useful here. However there is a problem: since FilteredSimplicialComplex objects can have their simplices inserted or changed, the cache key needs to take into account whether the complex has changed since last time persistent homology was computed or not. I was thinking of just counting the number of times that the complex has been modified and have that also be part of the cache key, but maybe there is a simpler / less hacky solution... In any case, I think that the _compute_persistent_homology method should be cached rather than the persistence_intervals method, since it can be called when computing intervals or betti numbers.

Also, why can't you just pass _filtration_dict as a list of simplicies to SimplicialComplex in _simplicial_? Related, ...

Yes this is much simpler, thank you !


New commits:

47403e2cached computation of homology, made `_simplicial_` simpler, removed some useless attributes
tscrim commented 3 years ago
comment:30

I would just call foo.cached_func.clear_cache() to clear the cache when the complex is mutated.

Addendum: The way you have it implemented now is effectively a memory leak since you are always adding more stuff to the cache every time you run it.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 47403e2 to bafca26

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

bafca26simplified cache
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from bafca26 to 354b23d

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

354b23d_persistent_homology now properly returns intervals
mkoeppe commented 3 years ago
comment:33

Setting a new milestone for this ticket based on a cursory review.

tscrim commented 3 years ago
comment:34

Sorry for loosing track of this; I didn't get any notifications from trac that you pushed changes.

I think all of my and John's current comments have been taken into account, correct?

Can you provide a rebase onto the latest beta? I will continue my review at that point.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 354b23d to 98f2a1d

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

3e39330fixed class name, added `_simplicial_` and prune, removed negative constraint
66616b7removed mention of degree, fused insert and get_value into filtration
1fd4564fixed docstring error
09aada4changed file name, added doc for filtration method
bfcbc93PEP8
eb621ffadded more doc
d0811c7cached computation of homology, made `_simplicial_` simpler, removed some useless attributes
16a8d9bsimplified cache
bdfd1e9_persistent_homology now properly returns intervals
98f2a1d Merge branch 'u/gh-Quenouillaume/persistent_homology' of trac.sagemath.org:sage into t/31861/persistent_homology
Quenouillaume commented 3 years ago
comment:36

I believe I've taken all your comments into account ! And I've just finished rebasing if you wish to keep reviewing.