ysig / GraKeL

A scikit-learn compatible library for graph kernels
https://ysig.github.io/GraKeL/

Normalized RandomWalk returning k(G,G) = -1 #8

Closed dskoda closed 5 years ago

dskoda commented 5 years ago

Hi,

I am using GraKeL as installed from the repository (grakel-dev, version 0.1a4). While playing around with RandomWalk, I found that the normalized kernel of a graph G with itself, k(G,G), is sometimes equal to -1. Here's a minimal working example that reproduces the result:

from grakel import RandomWalk

rw_kernel = RandomWalk(normalize=True)

A = [[0, 1, 1, 1],
     [1, 0, 1, 1],
     [1, 1, 0, 1],
     [1, 1, 1, 0]]
labels_A = {0: 'A', 1: 'B', 2: 'C', 3: 'D'}
print(rw_kernel.fit_transform([[A, labels_A]])) # [[1.]]

B = [[0, 1, 1, 1, 1],
     [1, 0, 1, 1, 1],
     [1, 1, 0, 1, 1],
     [1, 1, 1, 0, 1],
     [1, 1, 1, 1, 0]]

labels_B = {0: 'A', 1: 'B', 2: 'C', 3: 'D', 4: 'E'}
print(rw_kernel.fit_transform([[B, labels_B]])) # [[-1.]]

Why is this happening?

Thanks in advance for your support.

ysig commented 5 years ago

You should try a smaller lambda. Theoretically, the geometric random walk kernel converges only if 1/lamda is larger than the maximum eigenvalue of the Kronecker product of the adjacency matrices of g1 and g2. Only when the series converges is the kernel positive semi-definite; otherwise the computed value is not that of a valid kernel, and some other implementations set it to zero in that case.
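To see why the K5 example fails where the K4 one succeeds, here is a small sketch of the convergence check (using numpy; the value lamda=0.1 is an illustrative assumption, not necessarily GraKeL's default). For the Kronecker product of a graph with itself, the largest eigenvalue is the square of the graph's largest eigenvalue:

```python
import numpy as np

# Geometric random walk kernel converges iff 1/lamda > lambda_max(A1 (x) A2).
lamda = 0.1  # assumed illustrative value

A = np.ones((4, 4)) - np.eye(4)  # K4: complete graph on 4 nodes, lambda_max = 3
B = np.ones((5, 5)) - np.eye(5)  # K5: complete graph on 5 nodes, lambda_max = 4

for name, M in [("K4", A), ("K5", B)]:
    lam_max = np.linalg.eigvalsh(np.kron(M, M)).max()
    verdict = "converges" if 1.0 / lamda > lam_max else "diverges"
    print(f"{name}: lambda_max(kron) = {lam_max:.1f} -> {verdict}")
# K4: lambda_max(kron) = 9.0 -> converges   (1/0.1 = 10 > 9)
# K5: lambda_max(kron) = 16.0 -> diverges   (10 < 16)
```

So with the same lambda, the larger and denser graph pushes the Kronecker spectrum past the convergence threshold, which is why k(G,G) stops being a valid kernel value.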

A rule of thumb, as noted by Vishwanathan, is to set lamda to the largest power of 10 that is smaller than 1/d^2, where d is the largest degree in the dataset.
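That rule of thumb is easy to automate. A minimal sketch (the helper name `rule_of_thumb_lamda` is mine, not part of GraKeL):

```python
import math

def rule_of_thumb_lamda(d):
    """Largest power of 10 strictly below 1/d^2, where d is the max degree."""
    bound = 1.0 / d**2
    exp = math.floor(math.log10(bound))
    if 10.0 ** exp >= bound:  # bound was itself a power of 10; step down once
        exp -= 1
    return 10.0 ** exp

# In the K5 example every vertex has degree 4, so 1/d^2 = 1/16 = 0.0625
print(rule_of_thumb_lamda(4))  # -> 0.01
```

This reproduces the lamda=0.01 used below.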

B = [[0, 1, 1, 1, 1],
     [1, 0, 1, 1, 1],
     [1, 1, 0, 1, 1],
     [1, 1, 1, 0, 1],
     [1, 1, 1, 1, 0]]

rw_kernel_bl = RandomWalk(method_type="baseline", lamda=0.01)
rw_kernel_f = RandomWalk(lamda=0.01)

print(rw_kernel_bl.fit_transform([[B]])) # [[29.76190476]] -> [[1.0]] after normalization
print(rw_kernel_f.fit_transform([[B]])) # [[29.76190476]] -> [[1.0]] after normalization

Please install the latest grakel, as I spotted an inconsistency in the baseline. Thanks a lot for taking our library into consideration :+1:

dskoda commented 5 years ago

Got it. I installed the new version and the example works. The only difference from your example is that print(rw_kernel_f.fit_transform([[B]])) returns [[0.04761905]].

Thank you very much for your support and for the library!