On the representative_cycles branch

ulupo commented 4 years ago

First of all, thank you @ubauer for the amazing work done with ripser! (Opening an issue gives me a good excuse to finally say this directly!)

It seems to me that the quantum leap in computational runtimes made possible by ripser is an important piece in the history of making TDA more appealing and accessible to non-topologists. As you know, this is true in particular in the Python community (the one I am most familiar with) thanks to the great effort by you, @ctralie, @sauln (and maybe others?) in making solid bindings in ripser.py. It is now much less esoteric to suggest that feature extraction derived from persistent homology can be made an integral part of machine learning pipelines. Several projects are now exploring how to provide "regular data scientists" with plug-in topological components which can be used alongside more conventional machine learning toolkits. scikit-tda, GUDHI and giotto-tda (the last of which I am involved with) are some examples.

When interacting with non-topologists who are data science practitioners, I find that arguing in favour of TDA based on its ability to describe geometric structures that are in principle "there to see" tends to be successful. It brings the field very close to, say, clustering and other data-viz techniques which everyone would agree are useful!

Persistence diagrams/barcodes are great of course, but they are to the full persistent homology calculation a bit (or a lot, for H_0) as returning a cut dendrogram in single-linkage clustering would be to returning an actual clustering of the data. Representative (persistent) cycles, if visualized, could make (Vietoris-Rips) persistent homology a much more "immediate" concept to grapple with for many, not to mention the actual insight into the data that they would bring.

Sorry for the long spiel which contains little new information to you (mostly there to provide context for other readers), but I hope it helps me segue effectively into the real questions. I noticed that you mention the experimental representative_cycles branch, which is great! I am wondering: What would be your quick assessment on the status of progress there? Is there a hoped-for release date, or is work there still in a research phase? Do you expect that final runtimes will compare favourably with Eirene?

I doubt I could contribute much to the C++ codebase, but would love to eventually help the Python community integrate these developments.

Thanks for the patience in reading!

ubauer commented 2 years ago

The approach in the representative_cycles branch seems to be the most reasonable option for the time being. It uses persistent cohomology to compute the birth/death pairs, and then persistent homology with clearing, using those pairs, to obtain representative cycles. But I am hoping that there might be an even more efficient way to achieve this in the future. We are currently looking into this.
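
For readers who want to see the two-pass idea concretely, here is a minimal toy sketch in Python. It is not the code in the branch and makes several simplifying assumptions: Z/2 coefficients, an explicitly listed filtration instead of an implicit Vietoris-Rips complex, dense Python sets as columns, and no handling of essential (infinite) classes. The function names `cohomology_pairs` and `representative_cycles` are made up for illustration. The first pass reduces the coboundary matrix to find the (birth, death) pairs; the second pass reduces only the boundary columns of the death simplices, which is the effect of clearing once the pairs are known, and reads off each reduced column as a representative cycle.

```python
from itertools import combinations


def faces(simplex):
    """Codimension-1 faces of a simplex given as a sorted vertex tuple."""
    if len(simplex) == 1:
        return []
    return list(combinations(simplex, len(simplex) - 1))


def cohomology_pairs(simplices):
    """Pass 1: reduce the coboundary matrix over Z/2.

    `simplices` is listed in filtration order. Returns {death: birth},
    both given as positions in that list.
    """
    index = {s: i for i, s in enumerate(simplices)}
    cob = [set() for _ in simplices]  # cob[i] = cofaces of simplex i
    for j, s in enumerate(simplices):
        for f in faces(s):
            cob[index[f]].add(j)
    owner = {}  # pivot row (a death simplex) -> column that owns it
    pairs = {}
    for i in reversed(range(len(simplices))):
        col = cob[i]
        # The pivot of a coboundary column is its earliest coface.
        while col and min(col) in owner:
            col ^= cob[owner[min(col)]]  # in-place column addition mod 2
        if col:
            owner[min(col)] = i
            pairs[min(col)] = i
    return pairs


def representative_cycles(simplices):
    """Pass 2: reduce only the boundary columns of death simplices.

    Birth columns would reduce to zero anyway, so skipping them is safe.
    Returns {(birth simplex, death simplex): sorted list of cycle simplices}.
    """
    index = {s: i for i, s in enumerate(simplices)}
    pairs = cohomology_pairs(simplices)
    reduced, pivot_owner, cycles = {}, {}, {}
    for death in sorted(pairs):
        col = {index[f] for f in faces(simplices[death])}
        while col and max(col) in pivot_owner:
            col ^= reduced[pivot_owner[max(col)]]
        assert max(col) == pairs[death]  # both passes must agree on the pair
        pivot_owner[max(col)] = death
        reduced[death] = col
        key = (simplices[pairs[death]], simplices[death])
        cycles[key] = sorted(simplices[k] for k in col)
    return cycles


# Toy filtration: a square 0-1-2-3, then the diagonal (0, 2), then two triangles.
filtration = [
    (0,), (1,), (2,), (3,),
    (0, 1), (1, 2), (2, 3), (0, 3),
    (0, 2),
    (0, 1, 2), (0, 2, 3),
]
cycles = representative_cycles(filtration)
square = cycles[((0, 3), (0, 2, 3))]  # the four edges of the square
```

In this toy example the class born at edge (0, 3) and killed by triangle (0, 2, 3) gets the whole square as its representative, which only appears after the column of (0, 2, 3) has been added to that of (0, 1, 2). A real implementation would work dimension by dimension and never store the full matrices.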

AWildSugar commented 1 year ago

Hi, I'd also like to thank all contributors for the awesome package. I didn't want to open a new issue for this question, but I was wondering whether you were considering returning 'representatives' for 0-dim cycles? I'm imagining something like Eirene.jl where it just returns a single vertex for each? Not sure if the current lack of this is an algorithmic limitation or something else.

Thanks!
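
To illustrate what a single-vertex representative in dimension 0 could look like, here is a small union-find sketch in Python. It is only an assumption about the desired output, not a description of what Eirene.jl or ripser returns. In a Vietoris-Rips filtration every vertex is born at 0, so which component "dies" at a merge is a matter of convention; this sketch lets the component with the larger root index die and reports that root as the representative. The function name `h0_with_representatives` and the `(weight, u, v)` edge format are hypothetical.

```python
import math


def h0_with_representatives(n_vertices, weighted_edges):
    """Return H_0 bars as (birth, death, representative_vertex) triples.

    `weighted_edges` is an iterable of (weight, u, v). All vertices are
    assumed to be born at filtration value 0.
    """
    parent = list(range(n_vertices))

    def find(v):
        # Find the root of v, halving paths along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    bars = []
    for weight, u, v in sorted(weighted_edges):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # this edge closes a loop and does not change H_0
        survivor, dying = min(ru, rv), max(ru, rv)
        parent[dying] = survivor
        bars.append((0.0, weight, dying))
    # Components that never merge give infinite bars.
    for root in sorted({find(v) for v in range(n_vertices)}):
        bars.append((0.0, math.inf, root))
    return bars


edges = [(1.0, 0, 1), (2.0, 1, 2), (1.5, 2, 3), (3.0, 0, 3)]
bars = h0_with_representatives(4, edges)
```

Strictly speaking, a finite bar in dimension 0 over Z/2 is represented by the sum of two vertices that get connected at the death value; the single vertex reported here is one of them, the root of the component that is absorbed.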

Ripser / ripser

On the representative_cycles branch #33