Overflow error with maxdim>0

pierre-guillou commented 3 years ago

Hello and thank you for your work!

I am trying to use ripser to compute the persistence diagrams of 3D images. Looking at the Python code in ripser/ripser.py and especially at the ripser.lower_star_img function, I have been able to feed ripser distance matrices corresponding to my datasets. However, when I call the ripser.ripser function with maxdim=1 or above (ideally 2), the program aborts with an overflow error.

This problem also occurs using the Cell.jpg 2D image from the Lower Star Image Filtrations tutorial. When I use maxdim=1 when calling ripser.ripser in the ripser.lower_star_img function (or remove it since the default value is 1), the overflow exception happens here too.

If I understand correctly the underlaying C++ code in ripser/ripser.cpp, this overflow exception is caused by the computation of binomial coefficients that depend on the size of the distance matrix and the maxdim parameter.

Is this an inherent limitation of ripser? What's the right way to compute higher dimension persistence pairs?

Thank you in advance for your help, Best regards, Pierre

sauln commented 3 years ago

This is concerning. Thanks for notifying us. I'll try to take a look within the next week, but am totally swamped with work, so no guarantees. In the meantime if you're able to find the bug and fix, a PR would be immensely appreciated.

ctralie commented 3 years ago

I second what Nathaniel is saying, though I think this may just be an issue beyond the scope of ripser.py. The lower star image filtrations were not designed to support homology above 0D. For homology above that, it may just be blowing up the complex because of how many simplices there are in an image, particularly one in 3D.

On Wed, Apr 28, 2021 at 11:55 AM Nathaniel Saul @.***> wrote:

This is concerning. Thanks for notifying us. I'll try to take a look within the next week, but am totally swamped with work, so no guarantees. In the meantime if you're able to find the bug and fix, a PR would be immensely appreciated.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/scikit-tda/ripser.py/issues/120#issuecomment-828570464, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJWDZRMTDQ6XQ7PJ6C5VOLTLAVWZANCNFSM43XINLYQ .

pierre-guillou commented 3 years ago

Thank you for answering me about this issue. For reference, here is the Python script I used to trigger this overflow (the Cells.jpg is from here):

import numpy as np
import PIL.Image
from ripser import ripser
from scipy import sparse

def lower_star_img(img):
    m, n = img.shape

    idxs = np.arange(m * n).reshape((m, n))

    I = idxs.flatten()
    J = idxs.flatten()
    V = img.flatten()

    # Connect 8 spatial neighbors
    tidxs = np.ones((m + 2, n + 2), dtype=np.int64) * np.nan
    tidxs[1:-1, 1:-1] = idxs

    tD = np.ones_like(tidxs) * np.nan
    tD[1:-1, 1:-1] = img

    for di in [-1, 0, 1]:
        for dj in [-1, 0, 1]:

            if di == 0 and dj == 0:
                continue

            thisJ = np.roll(np.roll(tidxs, di, axis=0), dj, axis=1)
            thisD = np.roll(np.roll(tD, di, axis=0), dj, axis=1)
            thisD = np.maximum(thisD, tD)

            # Deal with boundaries
            boundary = ~np.isnan(thisD)
            thisI = tidxs[boundary]
            thisJ = thisJ[boundary]
            thisD = thisD[boundary]

            I = np.concatenate((I, thisI.flatten()))
            J = np.concatenate((J, thisJ.flatten()))
            V = np.concatenate((V, thisD.flatten()))

    sparseDM = sparse.coo_matrix((V, (I, J)), shape=(idxs.size, idxs.size))

    return ripser(sparseDM, distance_matrix=True, maxdim=1)["dgms"][0]

if __name__ == "__main__":
    cells_grey = np.asarray(PIL.Image.open("Cells.jpg").convert("L"))
    dgm = lower_star_img(-cells_grey)
    print(dgm)

I copied the lower_star_img method from ripser.py and used maxdim=1 instead of 0. The rest comes from the Lower Star Image Filtrations tutorial.

In the meantime, I've looked into the C++ code and I'm trying to check if raising max_simplex_index to std::numeric_limits<index_t>::max() (https://github.com/scikit-tda/ripser.py/blob/ff3f017e451cbf1d9c31acaa55ab8510826d960c/ripser/ripser.cpp#L83) might be enough.

scikit-tda / ripser.py

Overflow error with maxdim>0 #120