hsidky / dmaps

C++ Accelerated Python Diffusion Maps Library
MIT License

Basic example: need some clarifying explanations #2

Closed by teshaTe 4 years ago

teshaTe commented 4 years ago

Hello, and thank you for providing access to your library! I have been playing around with the Python version of the library, trying to figure out how it works by repeating the basic example. It seems that I am doing something wrong, but I cannot figure out what, as the resulting diffusion map is definitely not right. I would be grateful if you could explain where I am mistaken. Thank you!

Here is the code:

import dmaps
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection on older matplotlib

length_phi = 12
length_Z = 12
sigma = 0.1
m = 10000

# Sample a noisy swiss roll
phi = length_phi * np.random.rand(m)
xi = np.random.rand(m)
Z = length_Z * np.random.rand(m)
X = 1./6 * (phi + sigma * xi) * np.sin(phi)
Y = 1./6 * (phi + sigma * xi) * np.cos(phi)

swiss_roll = np.array([X, Y, Z]).transpose()
print(swiss_roll.shape)

# Pairwise distances
dist = dmaps.DistanceMatrix(swiss_roll)
dist.compute(metric=dmaps.metrics.euclidean)
dist.save('distMetr.jpeg')

# Diffusion map with kernel bandwidth 3, keeping 3 eigenpairs
diffMap = dmaps.DiffusionMap(dist)
diffMap.set_kernel_bandwidth(3)
diffMap.compute(3)

v = diffMap.get_eigenvectors()
w = diffMap.get_eigenvalues()

plt.rcParams["figure.figsize"] = (8, 12)
fig = plt.figure()
ax = fig.add_subplot(211, projection='3d')
ax.scatter(swiss_roll[:, 0], swiss_roll[:, 1], swiss_roll[:, 2], c=swiss_roll[:, 1], cmap="Spectral")
ax.set_title("Original data")

ax = fig.add_subplot(212)
arr0 = ax.scatter(v[:, 1]/v[:, 0], v[:, 2]/v[:, 0], c=swiss_roll[:, 1], cmap="Spectral")
plt.xlabel(r'$\Psi_2$')
plt.ylabel(r'$\Psi_3$')
plt.title('Projected data')
plt.show()

[Attached image: result1]

hsidky commented 4 years ago

Hi,

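The main issue is that your kernel bandwidth is too large. Coifman et al. suggest choosing it from a log-log plot of the sum of the similarity matrix against the bandwidth epsilon: pick epsilon from the region where the plot is linear (twice the slope in that region estimates the intrinsic dimension of the data). There is a convenience function in dmaps to sum the similarity matrix: diffMap.sum_similarity_matrix(epsilon).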
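The log-log scan can be sketched without the library. The snippet below is a minimal NumPy-only illustration, not the dmaps implementation: the helper `sum_similarity`, the kernel form `exp(-d^2 / (2 * epsilon))`, the sample size and the epsilon grid are all assumptions made for the example. With dmaps itself, diffMap.sum_similarity_matrix(epsilon) would take the place of the helper.

```python
import numpy as np


def sum_similarity(sq_dists, epsilon):
    """Sum of a Gaussian similarity matrix.

    Assumes the kernel k_ij = exp(-d_ij^2 / (2 * epsilon)); the kernel
    actually used by dmaps may be scaled differently.
    """
    return np.exp(-sq_dists / (2.0 * epsilon)).sum()


# Small swiss roll, built like the one in the question (m reduced so the
# dense m x m matrix stays cheap).
rng = np.random.default_rng(0)
m = 500
phi = 12 * rng.random(m)
xi = rng.random(m)
Z = 12 * rng.random(m)
X = 1. / 6 * (phi + 0.1 * xi) * np.sin(phi)
Y = 1. / 6 * (phi + 0.1 * xi) * np.cos(phi)
points = np.column_stack([X, Y, Z])

# Squared Euclidean distances between all pairs of points.
diff = points[:, None, :] - points[None, :, :]
sq_dists = (diff ** 2).sum(axis=-1)

# Scan the bandwidth over several decades and record log(sum) vs log(epsilon).
epsilons = np.logspace(-4, 3, 29)
log_eps = np.log(epsilons)
log_sums = np.log([sum_similarity(sq_dists, e) for e in epsilons])

# Local slope of the log-log curve; look for a plateau in this column.
slopes = np.gradient(log_sums, log_eps)
for e, s in zip(epsilons, slopes):
    print(f"epsilon={e:10.4g}  slope={s:6.3f}  2*slope={2 * s:6.3f}")
```

The curve flattens at both ends: for very small epsilon only the diagonal contributes (sum close to m), and for very large epsilon every entry approaches 1 (sum close to m squared). A usable bandwidth lies in the roughly linear stretch in between.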

You should get a good result with a kernel bandwidth of about 0.1.
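To see the effect of the bandwidth without installing anything, here is a rough NumPy-only sketch of a basic diffusion map run at both bandwidths from this thread. It is an illustration under assumptions, not what dmaps does internally: the function `diffusion_map`, the kernel form `exp(-d^2 / (2 * epsilon))`, and the plain row normalization (no density correction) are all choices made for the example.

```python
import numpy as np


def diffusion_map(points, epsilon, n_components):
    """Leading eigenpairs of the Markov matrix P = D^-1 K (hypothetical helper).

    Assumes a Gaussian kernel K_ij = exp(-d_ij^2 / (2 * epsilon)).
    """
    diff = points[:, None, :] - points[None, :, :]
    sq_dists = (diff ** 2).sum(axis=-1)
    K = np.exp(-sq_dists / (2.0 * epsilon))
    d = K.sum(axis=1)  # row sums (degrees), always > 0 because K_ii = 1

    # S = D^-1/2 K D^-1/2 is symmetric and shares its eigenvalues with P.
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)

    # eigh returns ascending eigenvalues; keep the largest n_components.
    order = np.argsort(vals)[::-1][:n_components]
    vals = vals[order]
    # Convert eigenvectors of S into right eigenvectors of P.
    psi = vecs[:, order] / np.sqrt(d)[:, None]
    return vals, psi


# Small swiss roll, built like the one in the question.
rng = np.random.default_rng(0)
m = 400
phi = 12 * rng.random(m)
xi = rng.random(m)
Z = 12 * rng.random(m)
X = 1. / 6 * (phi + 0.1 * xi) * np.sin(phi)
Y = 1. / 6 * (phi + 0.1 * xi) * np.cos(phi)
points = np.column_stack([X, Y, Z])

results = {}
for eps in (3.0, 0.1):
    vals, psi = diffusion_map(points, eps, 3)
    results[eps] = (vals, psi)
    print(f"epsilon={eps}: leading eigenvalues = {vals}")
```

The coordinates plotted in the question correspond to `psi[:, 1] / psi[:, 0]` and `psi[:, 2] / psi[:, 0]`; scattering them for each entry of `results` shows how the embedding changes between the two bandwidths.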

teshaTe commented 4 years ago

Great, thank you very much for the help! I also read the article linked from another post, https://sidky.io/blog/diffusion-maps-part-2/, and everything became much clearer. Again, thank you for your help!