giotto-ai / giotto-tda

A high-performance topological machine learning toolbox in Python
https://giotto-ai.github.io/gtda-docs

simplex index 9223716361969802160 in filtration is larger than maximum index 36028797018963967 #678

Open jbeuria opened 1 year ago

jbeuria commented 1 year ago

Describe the bug

  File "/home/server/.local/lib/python3.10/site-packages/gtda/homology/simplicial.py", line 1126, in _weak_alpha_diagram
    Xdgms = ripser(dm, maxdim=self._max_homology_dimension,
  File "/home/server/.local/lib/python3.10/site-packages/gph/python/ripser_interface.py", line 603, in ripser_parallel
    res = _compute_ph_vr_sparse(
  File "/home/server/.local/lib/python3.10/site-packages/gph/python/ripser_interface.py", line 51, in _compute_ph_vr_sparse
    ret = gph_ripser.rips_dm_sparse(I, J, V, I.size, N, coeff,
OverflowError: simplex index 9223716361969802160 in filtration is larger than maximum index 36028797018963967

To reproduce

Steps to reproduce the behavior: the error appears when the point cloud contains roughly 1 million points.

matteocao commented 4 months ago

@jbeuria do you have any clue how this can be fixed? It seems to be a generic error, possibly requiring a change to the type of some indices. If you have ideas and can help with a PR, that would be great.

BrandonJRobinson commented 2 weeks ago

When trying to compute VietorisRipsPersistence with homology_dimensions > 3, I've run into the same issue:

OverflowError: simplex index 9223642139012799036 in filtration is larger than maximum index 36028797018963967

To reproduce: take a relatively small number of samples (>10000) but a large number of features (~1000).
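Both reports can be checked with some quick arithmetic. The sketch below assumes a Ripser-style combinatorial indexing scheme, in which computing homology up to dimension d on n points requires binomial coefficients up to C(n, d + 2) to fit under the limit quoted in the error message (which equals 2^55 - 1). The helper names are hypothetical and the model is an assumption, not something confirmed in this thread. Under that model, the number of features plays no role; only the number of points and the top homology dimension matter.

```python
from math import comb

# Limit quoted verbatim in the error message; it equals 2**55 - 1.
MAX_INDEX = 36028797018963967


def required_index_range(n_points: int, max_homology_dim: int) -> int:
    """Largest binomial coefficient needed under the assumed indexing model.

    Assumption: simplices up to dimension max_homology_dim + 1 are indexed,
    and each such simplex has max_homology_dim + 2 vertices.
    """
    return comb(n_points, max_homology_dim + 2)


def overflows(n_points: int, max_homology_dim: int) -> bool:
    """Return True if the assumed index range exceeds the quoted limit."""
    return required_index_range(n_points, max_homology_dim) > MAX_INDEX


# About 1 million points, homology up to dimension 1 (first report).
print(overflows(1_000_000, 1))  # True
# 10000 points: dimension 2 fits, dimension 3 does not (second report).
print(overflows(10_000, 2))     # False
print(overflows(10_000, 3))     # True
```

If this model is right, the error is raised before any filtration values are examined, which would explain why it looks generic.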

jbeuria commented 2 weeks ago

> @jbeuria do you have any clue how this can be fixed? It seems to be a generic error, possibly requiring a change to the type of some indices. If you have ideas and can help with a PR, that would be great.

Unfortunately, one has to reduce the number of simplices being formed: the maximum filtration parameter can be lowered, or a different kind of simplicial complex can be used.
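As a rough guide to how far the input would have to shrink, the sketch below estimates the largest point count that stays under the quoted limit for a given top homology dimension. It assumes the limit applies to C(n, d + 2) for n points and top dimension d, which is a guess at the indexing scheme rather than a documented guarantee, and `max_points` is a hypothetical helper. It covers only the point-count side (for example, subsampling the cloud), not the effect of the filtration parameter.

```python
from math import comb

# Limit quoted in the error message (2**55 - 1).
MAX_INDEX = 36028797018963967


def max_points(max_homology_dim: int, limit: int = MAX_INDEX) -> int:
    """Largest n with comb(n, max_homology_dim + 2) <= limit (assumed model)."""
    k = max_homology_dim + 2
    lo, hi = k, k + 1  # comb(k, k) == 1 always fits under the limit
    # Grow the upper bound until it no longer fits.
    while comb(hi, k) <= limit:
        hi *= 2
    # Binary search, keeping comb(lo, k) <= limit < comb(hi, k).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if comb(mid, k) <= limit:
            lo = mid
        else:
            hi = mid
    return lo


for dim in range(1, 5):
    print(dim, max_points(dim))
```

Under this assumption the admissible point count drops quickly as the top homology dimension grows, so lowering the homology dimension is another way to stay within the limit.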