Closed mbr085 closed 5 years ago
In short the problem is that the thresh option produces results that are different from the results of the classic command line ripser.
from ripser import ripser, plot_dgms
import itertools
import numpy as np
n=3
segment = [np.linspace(0,1,5)]
endpoints = [np.linspace(0,1,2)]
face = segment * (n - 1) + endpoints
vertices = []
for k in range(n):
vertices.extend(itertools.product(*(face[k:] + face[:k])))
coords = np.array(corners)
rips = ripser(coords, maxdim=2, thresh=.8)
plot_dgms(rips['dgms'])
exporting to a file
np.savetxt('box.txt', X=coords)
and running the Bauer version of ripser from the command line gives a different result.
Sorry for the trouble, and thank you for the detailed report! I will have a look tomorrow
Hello, Here's where I am so far, comparing this to GUDHI:
Code below. 0D and 1D persistent homology looks the same, but GUDHI is missing the 2D dot. I should note that in our ripser.py, the 2D dot is born when all of the 1D dots die. This makes sense from the point cloud you setup, so that seems correct to me. This is also the case from ripser in the command line, but the result is different, as you noted.
I'll dig into this more deeply in a few hours when I get back from some meetings. For now, here's the code I wrote:
from ripser import ripser, plot_dgms
import gudhi
import itertools
import numpy as np
import matplotlib.pyplot as plt
def convertGUDHIPD(pers):
"""
Convert GUDHI persistence diagrams into a list of arrays
"""
Is = {}
for i in range(len(pers)):
(dim, (b, d)) = pers[i]
if not dim in Is:
Is[dim] = []
Is[dim].append([b, d])
Is = [np.array(Is[dim]) for dim in Is]
return Is
n=3
segment = [np.linspace(0,1,5)]
endpoints = [np.linspace(0,1,2)]
face = segment * (n - 1) + endpoints
vertices = []
for k in range(n):
vertices.extend(itertools.product(*(face[k:] + face[:k])))
coords = np.array(vertices)
np.savetxt('box.txt', X=coords)
rips = ripser(coords, maxdim=2)
dgmsrips = rips['dgms']
rips_complex = gudhi.RipsComplex(points=coords,max_edge_length=1e9)
simplex_tree = rips_complex.create_simplex_tree(max_dimension=2)
dgmsgudhi = simplex_tree.persistence(homology_coeff_field=2, min_persistence=0)
dgmsgudhi = convertGUDHIPD(dgmsgudhi)
plt.figure(figsize=(10, 5))
plt.subplot(121)
plot_dgms(dgmsrips, xy_range=[-0.2, 1.5, 0, 1.5])
plt.title("Ripser")
plt.subplot(122)
plot_dgms(dgmsgudhi, xy_range=[-0.2, 1.5, 0, 1.5])
plt.show()
Using GUDHI you need to add simplices up to dimension 3 in order to compute homology in dimension 2. I have taken the liberty to modify your script so that it contains both the GUDHI and the ripser results. You will notice that they agree.
%matplotlib inline
from ripser import ripser, plot_dgms
import gudhi
import itertools
import numpy as np
import matplotlib.pyplot as plt
def convertGUDHIPD(pers):
"""
Convert GUDHI persistence diagrams into a list of arrays
"""
Is = {}
for i in range(len(pers))[::-1]:
(dim, (b, d)) = pers[i]
if not dim in Is:
Is[dim] = []
Is[dim].append([b, d])
Is = [np.array(Is[dim]) for dim in Is]
return Is
n=3
segment = [np.linspace(0,1,5)]
endpoints = [np.linspace(0,1,2)]
face = segment * (n - 1) + endpoints
vertices = []
for k in range(n):
vertices.extend(itertools.product(*(face[k:] + face[:k])))
coords = np.array(vertices)
np.savetxt('box.txt', X=coords)
rips = ripser(coords, maxdim=2)
dgmsrips = rips['dgms']
rips_complex = gudhi.RipsComplex(points=coords) #,max_edge_length=1e9)
simplex_tree = rips_complex.create_simplex_tree(max_dimension=3)
dgmsgudhi2 = simplex_tree.persistence(homology_coeff_field=2)
dgmsgudhi = convertGUDHIPD(dgmsgudhi2)
plt.figure(figsize=(10, 5))
plt.subplot(121)
plot_dgms(dgmsrips, xy_range=[-0.2, 1.5, 0, 1.5])
plt.title("Ripser")
plt.subplot(122)
plot_dgms(dgmsgudhi, xy_range=[-0.2, 1.5, 0, 1.5])
plt.title('GUDHI')
plt.show()
This is in contrast to the situation where a threshold is introduced:
%matplotlib inline
from ripser import ripser, plot_dgms
import gudhi
import itertools
import numpy as np
import matplotlib.pyplot as plt
def convertGUDHIPD(pers):
"""
Convert GUDHI persistence diagrams into a list of arrays
"""
Is = {}
for i in range(len(pers))[::-1]:
(dim, (b, d)) = pers[i]
if not dim in Is:
Is[dim] = []
Is[dim].append([b, d])
Is = [np.array(Is[dim]) for dim in Is]
return Is
n=3
segment = [np.linspace(0,1,5)]
endpoints = [np.linspace(0,1,2)]
face = segment * (n - 1) + endpoints
vertices = []
for k in range(n):
vertices.extend(itertools.product(*(face[k:] + face[:k])))
coords = np.array(vertices)
np.savetxt('box.txt', X=coords)
rips = ripser(coords, maxdim=2, thresh=1.4)
dgmsrips = rips['dgms']
rips_complex = gudhi.RipsComplex(points=coords) #,max_edge_length=1e9)
simplex_tree = rips_complex.create_simplex_tree(max_dimension=3)
dgmsgudhi2 = simplex_tree.persistence(homology_coeff_field=2)
dgmsgudhi = convertGUDHIPD(dgmsgudhi2)
plt.figure(figsize=(10, 5))
plt.subplot(121)
plot_dgms(dgmsrips, xy_range=[-0.2, 1.5, 0, 1.5])
plt.title("Ripser")
plt.subplot(122)
plot_dgms(dgmsgudhi, xy_range=[-0.2, 1.5, 0, 1.5])
plt.title('GUDHI')
plt.show()
Interesting, thanks for the update! I will have a look.
By the way, if you're looking to do 2D homology efficiently in low dimensions, you might check out alpha filtrations: http://gudhi.gforge.inria.fr/doc/latest/group__alpha__complex.html
I still definitely want to fix this bug though, so thanks for pointing it out!
@ubauer I could use your help on this. Do you have any idea why the command line version gives something completely different? (Output below) It doesn't make any sense, because it only has one connected component, but it has a bunch of 1D classes that are born at 0?
I have also noticed that in our wrapper, it seems to be a problem in the sparse matrix class. When I use a dense matrix, it seems to work correctly with all thresholds. Though in the command line version, the answer is the same regardless. Any ideas?
value range: [0,1]
sparse distance matrix with 30 points and 310/449 entries
persistence intervals in dim 0:
[0, )
persistence intervals in dim 1:
[0,0.25)
[0,0.25)
[0,0.25)
[0,0.25)
[0,0.25)
[0,0.25)
[0,0.25)
[0,0.25)
[0,0.25)
[0,0.25)
[0,0.25)
[0,0.25)
[0,0.25)
persistence intervals in dim 2:
[0.25,0.5)
[0.25,0.5)
[0.25,0.75)
[0.25,0.5)
[0.25,0.5)
[0.25,0.5)
[0.25,0.5)
[0.25,0.5)
[0.25,0.5)
[0.25,0.5)
[0.25,0.5)
[0,0.5)
Hi Chris,
It seems that you didn’t specify the input format as --point-cloud
. With that, you should get the expected results.
I don’t know what’s happening in the wrapper, but if i remember correctly you're not using Ripser to compute the distance matrix for a point cloud, but some extra Python code. Maybe the problem is related to that?
Hi Uli, Thanks for the quick response. Ah okay that completely explains it with what's happening in the console.
Yeah I also thought that about the custom code, but the same bug occurs with a sparse distance matrix. Something is going wrong with how I'm interfacing with the sparse distance matrix class in ripser....
Hi all, My sincerest apologies for the delay on this.
I believe I figured out the bug. As I said before, I managed to narrow it down to code that ran through the sparse distance matrix reduction. It was a very helpful hint that this problem didn't exist in the command line version (thank you for your detailed reporting @mbr085). So I carefully cross-referenced every sparse_distance_matrix template in ripser/ripser master with what's in our master. @ubauer somehow when I integrated your sparse matrix branch last spring, there was a line missing:
https://github.com/scikit-tda/ripser.py/commit/d6a68f4435e136b3f1ffd706010d15a03b303d9a
The result was that a single bar splits into a bunch of little bars. So for example, instead of
[0.35355339 1. ]
We'll end up with
[[0.35355338 0.5 ] [0.5 0.559017 ] [0.559017 0.61237246] [0.61237246 0.70710677] [0.70710677 0.75 ] [0.75 0.79056942] [0.79056942 0.82915622] [0.82915622 0.86602539] [0.86602539 0.90138781] [0.90138781 0.93541437] [0.93541437 1. ]]
@ubauer, does it makes sense that we would get such a result with that line missing?
@mbr085 I have confirmed that this fixes the example you provided. If you would pull on the "threshH2bugfix" branch, do a python setup.py install, and confirm that this fixes things: https://github.com/scikit-tda/ripser.py/tree/threshH2bugfix
And if you have any other examples that you found, it would be great if you could test them too. But I think it's solid now. Thanks, Chris
I tried to test it but saw no difference. Is there an easy way to confirm that I am running the updated version of the code?
Try pip uninstall ripser first. I had a similar issue before.
On Tue, Nov 13, 2018 at 3:58 AM Morten Brun notifications@github.com wrote:
I tried to test it but saw no difference. Is there an easy way to confirm that I am running the updated version of the code?
— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/scikit-tda/ripser.py/issues/51#issuecomment-438187440, or mute the thread https://github.com/notifications/unsubscribe-auth/ABNh5rD2iX-_y1wQUGp-wvIKz_wJAZNIks5uuonJgaJpZM4YD5rz .
If you reinstall the source from the branch now, you should be able to confirm you're running the updated code by checking the version:
python -c "import ripser; print(ripser.__version__)"
>>> 0.3.0.dev0
The output should be 0.3.0.dev0
rather than 0.3.0
as is on the master branch.
A fix has been issued to master. We'll roll out a new version to Pypi asap. If it persists for you @mbr085 , please let us know and reopen the issue.
Thank you.
I confirm that the issue is resolved.
Great! Thank you so much for the detailed report! This was a subtle one because few people use H2 and when I use it I don't use thresholds
On Wed, Nov 14, 2018, 2:14 AM Morten Brun <notifications@github.com wrote:
I confirm that the issue is resolved.
— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/scikit-tda/ripser.py/issues/51#issuecomment-438561839, or mute the thread https://github.com/notifications/unsubscribe-auth/ABNh5s0FFvZ0U9B6KPZB4l1Yx6hLYbXxks5uu8LtgaJpZM4YD5rz .
Thanks to everyone! Chris, you're right, the sorting is essential, and the behavior we saw is expected.
Please see https://github.com/Ripser/ripser/issues/20