GUDHI / gudhi-devel

The GUDHI library is a generic open source C++ library, with a Python interface, for Topological Data Analysis (TDA) and Higher Dimensional Geometry Understanding.
https://gudhi.inria.fr/
MIT License
258 stars 66 forks source link

Default behavior of persistence computation doesn't include last degree by default #878

Open DavidLapous opened 1 year ago

DavidLapous commented 1 year ago

Gudhi doesn't compute degree 1 homology of a 1-dimensional simplextree by default. Minimal example :

st = gd.SimplexTree()
st.insert([0,1,2],1)
st.remove_maximal_simplex([0,1,2])
st.persistence()

will return

[(0, (1.0, inf))]

instead of

[(0, (1.0, inf)),(1, (1.0, inf))]

@hschreiber pointed me out the culprit :

Line 277 in Persistent_Cohomology.h

where the condition can be removed.

Are there use cases where it's preferable not to do this computation ? I'm not sure this default behavior is ideal.

mglisse commented 1 year ago

This is controlled by an option: https://gudhi.inria.fr/python/latest/simplex_tree_ref.html#gudhi.SimplexTree.compute_persistence.params.persistence_dim_max , which you can set to True.

In order to compute homology in dimension k for a Rips filtration, you build the k+1 skeleton (higher dimensional simplices don't matter). However, this creates plenty of homology in dimension k+1, and you don't want to waste time on it (I don't remember if we measured how much time is saved, an extreme case might be a complete graph where we only wanted H0 but could end up needlessly computing H1).

When you compute persistence for an alpha-complex or a cubical complex, homology in the maximal dimension is trivial, so persistence_dim_max shouldn't have any effect (except in the periodic case for the last d-cell).

The best default is debatable. I agree that computing everything is more intuitive, and it would make sense to require special cases like the Rips that do not want it to specify that. However, changing the default one would mostly result in a significant slow-down of existing code computing Rips persistence. So I think we should at least wait until #691 is merged so we have some nicer recommendation for computing Rips persistence before we consider changing the default.

DavidLapous commented 1 year ago

oups, I just noticed this flag ... It only matters when, e.g., playing with graphs then 👍 . My bad

mglisse commented 1 year ago

It also matters in higher dimension, it is confusing if you build the 4 triangles that form the boundary of a tetrahedron, and H2 does not appear in the diagram. But in most cases, we would want True for a SimplexTree built by hand and False for one built automatically, and the second use case is more frequent (although I am not sure that much thought was given to the default option).

DavidLapous commented 1 year ago

However, changing the default one would mostly result in a significant slow-down of existing code computing Rips persistence

Is adding a flag in the python class such as "is the complex generated by a Rips, Alpha, or cubical" a possibility ? I suppose it doesn't add much overhead

mglisse commented 1 year ago

Is adding a flag in the python class such as "is the complex generated by a Rips, Alpha, or cubical" a possibility ?

So, I generate a Rips Complex then add a vertex, and that somehow enables the computation of H2? That's also going to be confusing. Without much thought, my current preference would be to have functions like in #691 (using False by default) for people who just want the diagram of a Rips filtration, and then change the default of SimplexTree to True.

mglisse commented 1 year ago

I think we can keep this issue open, so we remember to reconsider the default of persistence_dim_max once the new Persistence classes are ready.