GUDHI / gudhi-devel

The GUDHI library is a generic open source C++ library, with a Python interface, for Topological Data Analysis (TDA) and Higher Dimensional Geometry Understanding.
https://gudhi.inria.fr/
MIT License

Kernel dies while computing persistence #1141

Open MarioAuditore opened 1 day ago

MarioAuditore commented 1 day ago

I tried to play around with covariance matrices from geomstats:

import gudhi as gd
import geomstats.datasets.utils as data_utils

data, patient_ids, labels = data_utils.load_connectomes()

dgm = gd.RipsComplex(
    distance_matrix=data[0],
    max_edge_length=1.0
).create_simplex_tree(max_dimension=2)

dgm.compute_persistence()

However, this code instantly kills the Jupyter kernel. Environment: Python 3.11.4 on macOS, M1 Pro.

DavidLapous commented 21 hours ago

Hi Mario, from what I can see, your data[0] looks like a correlation matrix rather than a distance matrix, and I think that is where the segfault comes from. I agree that Gudhi should handle this better. One solution is to use 1 - data[0] as the distance between points instead, i.e., points that are strongly correlated will be connected sooner.
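A minimal sketch of that workaround, assuming the entries of data[0] are correlations in [-1, 1]. The helper name `correlation_to_distance` and the 3x3 matrix are made up for illustration and are not part of Gudhi or geomstats. The Gudhi calls are the same ones used in the original report and only run if the library is installed.

```python
def correlation_to_distance(corr):
    """Return 1 - corr entrywise as a list of lists (hypothetical helper).

    Assumes every entry of corr lies in [-1, 1], so every result lies in [0, 2].
    """
    return [[1.0 - value for value in row] for row in corr]


# Made-up correlation matrix standing in for data[0].
corr = [
    [1.0, 0.5, -0.5],
    [0.5, 1.0, 0.25],
    [-0.5, 0.25, 1.0],
]

dist = correlation_to_distance(corr)

# Smallest entry of the converted matrix; the diagonal maps to 0.
min_entry = min(min(row) for row in dist)

try:
    import gudhi as gd
except ImportError:
    gd = None  # Gudhi is optional for this sketch.

if gd is not None:
    # Same construction as in the report, but on the converted matrix.
    simplex_tree = gd.RipsComplex(
        distance_matrix=dist,
        max_edge_length=1.0,
    ).create_simplex_tree(max_dimension=2)
    simplex_tree.compute_persistence()
```

With this conversion, a correlation of 1 becomes distance 0 and a correlation of -1 becomes distance 2, so no edge can get a filtration value below 0. Whether 1 - correlation is the right dissimilarity for the connectome data is a modelling choice, not something this sketch settles.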

mglisse commented 18 hours ago

Indeed. The code only looks at the strict lower triangular part of the matrix, but you have negative values there, so we end up with edges whose filtration value is smaller than that of the vertices (0), which violates the definition of a filtration. We could add some tests to avoid crashing the Python interpreter. Testing whether a SimplexTree is a proper filtration may be a bit expensive, though.
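Until such tests exist in the library, a user-side check on the input matrix is one possible guard. The sketch below scans only the strict lower triangle, since that is the part described above as being read, and rejects negative entries. Both function names are hypothetical and not part of Gudhi.

```python
def find_negative_lower_entries(matrix):
    """Return (row, column, value) for each negative entry below the diagonal."""
    return [
        (i, j, matrix[i][j])
        for i in range(len(matrix))
        for j in range(i)  # strict lower triangle: column index < row index
        if matrix[i][j] < 0
    ]


def check_distance_matrix(matrix):
    """Raise ValueError if an edge would get a filtration value below 0."""
    bad = find_negative_lower_entries(matrix)
    if bad:
        i, j, value = bad[0]
        raise ValueError(
            f"entry ({i}, {j}) is {value}: an edge would appear before "
            f"its vertices (filtration value 0); {len(bad)} such entries found"
        )


# Made-up matrix with one negative entry, like a raw correlation matrix.
suspect = [
    [1.0, 0.5, -0.5],
    [0.5, 1.0, 0.25],
    [-0.5, 0.25, 1.0],
]

try:
    check_distance_matrix(suspect)
    rejected = False
except ValueError:
    rejected = True
```

This costs n(n-1)/2 comparisons on the input matrix, so it avoids walking the simplex tree itself. It only covers the negative-value case from this issue and is not a general validity test for a filtration.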