GUDHI / gudhi-devel

The GUDHI library is a generic open source C++ library, with a Python interface, for Topological Data Analysis (TDA) and Higher Dimensional Geometry Understanding.
https://gudhi.inria.fr/
MIT License

[Python - persistence_graphical_tools] numpy (N x 3) as input could be convenient #395

Open VincentRouvreau opened 4 years ago

VincentRouvreau commented 4 years ago

A user reports using numpy to load and save persistence diagrams that include dimension information:

np.loadtxt('./rips.pers')

array([[2.       , 0.138335 ,       inf],
       [1.       , 0.104347 ,       inf],
       ...,
       [1.       , 0.11169  , 0.111695 ]])
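For illustration, here is a minimal sketch of how such a file could be written and read back with numpy. It assumes the columns are (dimension, birth, death), as the output above suggests. The in-memory buffer stands in for a file such as './rips.pers', and the values are made up.

```python
import io
import numpy as np

# Hypothetical diagram with columns (dimension, birth, death); values are made up.
diagram = np.array([
    [1.0, 0.10, np.inf],
    [0.0, 0.00, 0.25],
])

# An in-memory buffer stands in for a file on disk.
buffer = io.StringIO()
np.savetxt(buffer, diagram)
buffer.seek(0)

# Infinite death values are written as "inf" and parsed back as np.inf.
loaded = np.loadtxt(buffer)
print(loaded.shape)
```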

A patch to make it work:

import numpy as np

def _array_handler(a):
    '''
    :param a: if array, assumes it is a (n x 2) or (n x 3) np.array and returns a
                persistence-compatible list (padding with dimension 0 when the
                array has no dimension column), so that the plot can be performed
                seamlessly. Any other input is returned unchanged.
    '''
    if len(a) == 0:
        return a
    if len(a[0]) == 3:
        # Rows are (dimension, birth, death).
        if isinstance(a[0][2], (np.floating, float)):
            return [[int(x[0]), (x[1], x[2])] for x in a]
    if len(a[0]) == 2:
        # Rows are (birth, death); every point is assigned dimension 0.
        if isinstance(a[0][1], (np.floating, float)):
            return [[0, x] for x in a]
    return a

Some documentation remains to be written.

mglisse commented 4 years ago

3 coordinates: dim, birth, death? That seems similar to a format that giotto likes; it is worth checking whether the two match.

Another interesting format, produced for instance by the scikit-tda interface to ripser, is a list of n x 2 arrays, one array per dimension.
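As a rough sketch of what supporting that layout could involve, the snippet below flattens a list of n x 2 arrays (one per dimension) into (dim, (birth, death)) pairs. The helper name per_dimension_to_list and the sample values are hypothetical and are not part of GUDHI, ripser, or scikit-tda.

```python
import numpy as np

def per_dimension_to_list(diagrams):
    """Hypothetical helper: [array_dim0, array_dim1, ...] -> [(dim, (birth, death)), ...]."""
    result = []
    for dim, points in enumerate(diagrams):
        # reshape(-1, 2) keeps an empty diagram valid as a (0, 2) array.
        for birth, death in np.asarray(points, dtype=float).reshape(-1, 2):
            result.append((dim, (float(birth), float(death))))
    return result

# Made-up input: two points in dimension 0, one point in dimension 1.
dgms = [np.array([[0.0, 0.5], [0.0, np.inf]]), np.array([[0.3, 0.4]])]
pairs = per_dimension_to_list(dgms)
print(pairs)
```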

At some point, with too many formats, it may become dangerous to guess, although for now it still seems doable.
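To make the guessing concrete, here is a hedged sketch of a shape-based heuristic covering the three layouts mentioned in this thread. The function guess_format and its return labels are hypothetical. It inspects only the first entry and assumes that entry is non-empty. It also cannot tell the column order of an n x 3 array from its shape alone, which is one reason guessing gets risky as formats multiply.

```python
import numpy as np

def guess_format(diag):
    """Hypothetical helper: name the layout of a non-empty diagram, or raise.

    Only the first entry is inspected, so this is a heuristic, not a validator.
    """
    first = diag[0]
    # First entry is itself a 2-D block of (birth, death) rows.
    if np.ndim(first[0]) == 1 and len(first[0]) == 2:
        return "list of n x 2 arrays"
    if np.ndim(first[0]) == 0 and len(first) in (2, 3):
        inner = np.ndim(first[1])
        # (dim, (birth, death)): a scalar followed by a pair.
        if inner == 1 and len(first) == 2:
            return "list of (dim, (birth, death))"
        # A flat row of 2 or 3 scalars.
        if inner == 0:
            return "n x %d array" % len(first)
    raise ValueError("unrecognized persistence diagram layout")

print(guess_format([(0, (0.0, 1.0))]))
print(guess_format(np.array([[1.0, 0.1, np.inf]])))
```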

Eventually, we may want to deprecate our current List[Tuple[int,Tuple[float,float]]], which is less convenient than the alternatives.
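For comparison, a minimal sketch of converting the current list format into an (n x 3) array with columns (dimension, birth, death). The helper name list_to_array is hypothetical, and the sample values are taken from the rows quoted earlier in this issue.

```python
import numpy as np

def list_to_array(persistence):
    """Hypothetical helper: [(dim, (birth, death)), ...] -> (n x 3) float array."""
    rows = [(dim, birth, death) for dim, (birth, death) in persistence]
    # reshape(-1, 3) makes an empty input come out as a (0, 3) array.
    return np.array(rows, dtype=float).reshape(-1, 3)

persistence = [(1, (0.104347, float("inf"))), (1, (0.11169, 0.111695))]
arr = list_to_array(persistence)
print(arr)
```

Once in array form, selecting one dimension is a single boolean mask, for example arr[arr[:, 0] == 1].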