DavidLapous / multipers

[NeurIPS2023,ICML2024] Multiparameter Persistence for Machine Learning
https://www-sop.inria.fr/members/David.Loiseaux/doc/multipers/index.html
MIT License
cpp cython icml-2024 multiparameter-persistence neurips-2023 persistent-homology python tda topological-data-analysis

multipers : Multiparameter Persistence for Machine Learning

A scikit-learn-style Python library for multiparameter persistent homology, compatible with PyTorch autodiff. It aims to provide easy-to-use and performant strategies for applied multiparameter topology.
It is meant to be integrated into the Gudhi library.

Multiparameter Persistence

This library computes several representations of "geometrical datasets" that have multiple scales, e.g., point clouds, images, and graphs. A well-known example is the following.
Pick a point cloud that has diffuse noise, or whose sampling measure has some interesting properties; in the example below, the measure has three modes.
Now define a two-parameter grid (filtration) of topological spaces (on the left) from a point cloud $P$, on which we will compute the persistence of some topological structures (homological cycles). This filtration $X$, indexed over a radius parameter $r$ and a codensity parameter $s$, is defined as follows:

$$ X_{r,s} = \bigcup_{x \in P, \, \mathrm{density}(x) \ge s} B(x,r) = \lbrace x\in \mathbb R^2 \mid \exists p \in P, \, \mathrm{density}(p) \ge s \text{ and } d(x,p) \le r \rbrace$$
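
The set-builder form of this definition translates directly into a membership test. The sketch below is a pure-Python illustration and does not use the multipers API: the function names `neighbor_density` and `in_filtration` are hypothetical, and the density estimate (fraction of points within a fixed bandwidth) is an assumption chosen only for simplicity. The inequality on the density follows the formula exactly as written above.

```python
import math


def neighbor_density(points, bandwidth):
    """Crude density estimate (an assumption for this sketch):
    fraction of the point cloud within `bandwidth` of each point."""
    n = len(points)
    return [
        sum(math.dist(p, q) <= bandwidth for q in points) / n
        for p in points
    ]


def in_filtration(x, points, densities, r, s):
    """Return True if x lies in X_{r,s}, i.e. if there is a point p
    with density(p) >= s and d(x, p) <= r."""
    return any(
        dens >= s and math.dist(x, p) <= r
        for p, dens in zip(points, densities)
    )


# A small cluster of three points plus one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
densities = neighbor_density(points, bandwidth=0.5)

# Near the isolated point: only covered once the density threshold is low enough.
print(in_filtration((5.0, 5.2), points, densities, r=0.5, s=0.5))
print(in_filtration((5.0, 5.2), points, densities, r=0.5, s=0.2))
```

Varying both `r` and `s` over a grid and recording how membership changes is what makes the filtration two-parameter: no single threshold has to be fixed in advance.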

The green shape on the left represents the lifetime of the biggest annulus. On the right, each cycle appearing on the left gets a colored shape (the color is only a label), and the extent of this shape represents the lifetime of the cycle.
In our case, the big green shape on the left corresponds to the largest green shape appearing on the right, recovering the structure of the annulus.

The magic part is that we never had to choose any parameter to remove the noise in this construction, but the annulus still naturally appears!
A more striking example is the following. Using the same construction, we can identify topological structures and their sizes in a parameter-free way, even though the majority of the sampling measure's mass is noise.
In this example, the lifetime shape associated to each cycle can be identified from

Notice that this construction is also very stable with respect to noise. The more noise is added, the smaller the "rainbow strip" becomes, and the more visible the "large shapes" are.

We also provide several other descriptors, as well as associated machine learning techniques and pipelines. The following example, computed from the same dataset, shows the Hilbert decomposition signed measure, the Euler decomposition signed measure, and the rank decomposition signed measure.
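
To give an idea of what a Hilbert decomposition signed measure is, here is a minimal sketch, assuming the Hilbert function (the dimension of homology at each grid point) is already known on a finite two-parameter grid. On such a grid, the signed measure can be obtained by Möbius inversion, which reduces to mixed finite differences. The function name `mobius_inversion_2d` is hypothetical, and this pure-Python code is an illustration, not the implementation used by multipers.

```python
def mobius_inversion_2d(hilbert):
    """Möbius inversion of a function given on a 2D grid.

    `hilbert[i][j]` is the value at grid point (i, j). The result maps each
    grid point with nonzero mass to its signed mass, so that summing the
    masses over all (a, b) with a <= i and b <= j gives back hilbert[i][j].
    """
    def h(i, j):
        # Values outside the grid (negative indices) are taken to be 0.
        return hilbert[i][j] if i >= 0 and j >= 0 else 0

    measure = {}
    for i in range(len(hilbert)):
        for j in range(len(hilbert[0])):
            mass = h(i, j) - h(i - 1, j) - h(i, j - 1) + h(i - 1, j - 1)
            if mass != 0:
                measure[(i, j)] = mass
    return measure


# Hilbert function of a single cycle alive on the square 1 <= i, j <= 2
# of a 4 x 4 grid.
hilbert = [
    [1 if 1 <= i <= 2 and 1 <= j <= 2 else 0 for j in range(4)]
    for i in range(4)
]
signed_measure = mobius_inversion_2d(hilbert)
print(signed_measure)
```

In this toy case the measure has positive mass where the cycle is born and at the opposite corner, and negative mass at the two corners where it dies along one axis, so a whole lifetime shape is summarized by four signed points.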

A non-exhaustive list of features can be found in the Features section, and in the documentation.

Quick start

This library is available on PyPI for Linux and macOS, via

pip install multipers

We recommend that Windows users use WSL.
Documentation and build instructions are available here.
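
After installing, one quick way to confirm that the package is visible to the current Python environment is to query its installed version through the standard library. This is a generic sketch that relies only on `importlib.metadata`, not on any multipers function; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if `pip install multipers` succeeded, otherwise None.
print(installed_version("multipers"))
```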

Features, and linked projects

This library features a bunch of different functions and helpers. See below for a non-exhaustive list.
A filled box indicates implemented or interfaced code.

If I missed something, or you want to add something, feel free to open an issue.

Authors

David Loiseaux,
Hannah Schreiber (Persistence backend code),
Luis Scoccola (Möbius inversion in Python, degree-rips using persistable and RIVET),
Mathieu Carrière (Sliced Wasserstein)

Contributions

Feel free to contribute, report a bug on a pipeline, or ask for documentation by opening an issue.
In particular, if you have a nice example or application that is not covered in the documentation (see the ./docs/notebooks/ folder), please contact me to add it there.