ysig / GraKeL

A scikit-learn compatible library for graph kernels
https://ysig.github.io/GraKeL/

Error when fitting graph kernels #40

Closed paulamartingonzalez closed 4 years ago

paulamartingonzalez commented 4 years ago

I am working on graph classification and I keep getting this error when fitting different kernels. Does anyone have an idea of what might be happening? I am using weighted edges and node features.

/usr/local/lib/python3.6/dist-packages/grakel/kernels/shortest_path.py in lhash_labels(S, u, v, args)
    497
    498 def lhash_labels(S, u, v, args):
--> 499     return (args[0][u], args[0][v], S[u, v])
    500
    501

KeyError: 0

giannisnik commented 4 years ago

Hi @paulamartingonzalez. Can you give us an example of your graphs (even an artificial one)? Are the node features multidimensional and/or continuous? If so, the shortest path kernel that you are trying to compute cannot handle them.

paulamartingonzalez commented 4 years ago

Hi @giannisnik , many thanks for your quick reply!

Here is an example of my adjacency matrix:

array([[0.        , 1.07610993, 1.69921904],
       [1.07610993, 0.        , 1.65961635],
       [1.69921904, 1.65961635, 0.        ]])

and node features:

{'0': [0.45333222075213203, 0.666666666666667], '1': [0.11486123061688, 0.833333333333333], '2': [0.00543828557391973, 0.666666666666667]}

I build them as: p = Graph(initialization_object=Adj_M,node_labels=feats_dict)

I tried the MultiscaleLaplacian kernel, which I thought should be able to handle continuous node attributes, and I get the following error:

/usr/local/lib/python3.6/dist-packages/grakel/kernels/multiscale_laplacian.py in parse_input(self, X)
    177     x.desired_format(self._graph_format)
    178 else:
--> 179     raise TypeError('each element of X must be either a '
    180                     'graph or an iterable with at least 1 '
    181                     'and at most 3 elements\n')

TypeError: each element of X must be either a graph or an iterable with at least 1 and at most 3 elements

And same with the SubgraphMatching:

/usr/local/lib/python3.6/dist-packages/grakel/graph.py in get_labels(self, label_type, purpose, return_none)
    737                             return None
    738                         else:
--> 739                             raise ValueError('Graph does not have any labels for edges.')
    740                     return self.index_edge_labels
    741                 else:

ValueError: Graph does not have any labels for edges.

Is it not possible to have multidimensional continuous node features?

Thanks in advance!

giannisnik commented 4 years ago

@paulamartingonzalez the multiscale Laplacian kernel can indeed handle continuous node features. The error you get is due to a bug that was recently fixed. To make the code run, just create the graphs as lists of adjacency matrices and node attributes without constructing Graph objects. See the following example:

import numpy as np
from grakel.kernels import MultiscaleLaplacianFast

adj = np.array([[0. , 1.07610993, 1.69921904],
                [1.07610993, 0. , 1.65961635],
                [1.69921904, 1.65961635, 0. ]])
node_attributes = {0: [0.45333222075213203, 0.666666666666667],
                   1: [0.11486123061688, 0.833333333333333],
                   2: [0.00543828557391973, 0.666666666666667]}
G1 = [adj, node_attributes]

adj = np.array([[0. , 1.2, 2.1],
                [0.9, 0. , 1.8],
                [1.5, 0.5, 0. ]])
node_attributes = {0: [0.3, 0.6],
                   1: [0.1, 0.9],
                   2: [0.01, 0.8]}
G2 = [adj, node_attributes]

Gs = [G1, G2]

ml_kernel = MultiscaleLaplacianFast(P=2, L=1, n_samples=2)
K = ml_kernel.fit_transform(Gs)

Other kernels you can use are the PropagationAttr kernel and the GraphHopper kernel. The subgraph matching kernel requires both node attributes and edge attributes. You can still use it if you initialize the attributes of all the edges to the same value, e.g., np.array([1]). Note, however, that the computational complexity of this kernel is high.
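The constant-edge-attribute workaround can be sketched as follows. This is a minimal illustration, not GraKeL's documented recipe: the helper name `constant_edge_attributes` is hypothetical, and it assumes that edge attributes are supplied as a dictionary keyed by `(i, j)` node-index pairs and passed as a third element of the graph list (the error message above mentions iterables of up to 3 elements). A plain list `[1]` stands in for `np.array([1])` so that the sketch has no dependencies.

```python
def constant_edge_attributes(adj, value=(1,)):
    """Map every off-diagonal entry with a non-zero weight to the same attribute.

    Assumption: edges are identified by (row, column) index pairs.
    """
    return {
        (i, j): list(value)
        for i, row in enumerate(adj)
        for j, weight in enumerate(row)
        if i != j and weight != 0
    }


# Adjacency matrix and node attributes from the example in this thread.
adj = [
    [0.0, 1.07610993, 1.69921904],
    [1.07610993, 0.0, 1.65961635],
    [1.69921904, 1.65961635, 0.0],
]
node_attributes = {
    0: [0.45333222075213203, 0.666666666666667],
    1: [0.11486123061688, 0.833333333333333],
    2: [0.00543828557391973, 0.666666666666667],
}

edge_attributes = constant_edge_attributes(adj)

# Same list format as G1 above, with the edge attributes as a third element
# (assumed ordering: adjacency, node attributes, edge attributes).
G1_with_edges = [adj, node_attributes, edge_attributes]
```

The helper iterates over rows, so it should also accept a NumPy array; whether the subgraph matching kernel accepts list-valued attributes rather than arrays has not been checked here.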

paulamartingonzalez commented 4 years ago

That's brilliant, it works perfectly with the three mentioned kernels! Thank you @giannisnik !

paulamartingonzalez commented 4 years ago

Apologies for reopening the issue, but I am having problems with the format you suggested above on bigger examples:

feats_dict = {'0': [0.0895097750195317, 1.0], '1': [0.00628627722849397, 0.8888888888888891], '2': [0.0252031718194117, 1.0], '3': [0.0528488565274871, 1.0], '4': [0.30002651582661, 1.0], '5': [0.13705942371091, 1.0], '6': [0.007772687607645259, 0.8888888888888891], '7': [0.0380466865017688, 1.0], '8': [0.0733257078444458, 1.0], '9': [0.0111790447265329, 1.0]}

Adj_M = array([[0. , 1.13138636, 0.58252333, 0.8773196 , 0.61798391, 0.60219924, 1.45846769, 0.69422459, 1.27050802, 1.58767049],
               [1.13138636, 0. , 0.23395952, 1.63015232, 1.54247732, 1.32162261, 1.21141635, 1.15069717, 0.93539068, 0.54741303],
               [0.58252333, 0.23395952, 0. , 1.73541449, 1.32459676, 0.93977327, 1.50681407, 1.1305574 , 0.89943544, 0.71481475],
               [0.8773196 , 1.63015232, 1.73541449, 0. , 0.32754151, 0.8368807 , 0.87374159, 0.74595006, 1.26707514, 1.69824053],
               [0.61798391, 1.54247732, 1.32459676, 0.32754151, 0. , 0.16530014, 1.57052335, 1.35333576, 0.64273896, 1.34170717],
               [0.60219924, 1.32162261, 0.93977327, 0.8368807 , 0.16530014, 0. , 1.83282337, 1.66718805, 0.30492003, 0.91574163],
               [1.45846769, 1.21141635, 1.50681407, 0.87374159, 1.57052335, 1.83282337, 0. , 0.33027513, 1.66711515, 1.24966903],
               [0.69422459, 1.15069717, 1.1305574 , 0.74595006, 1.35333576, 1.66718805, 0.33027513, 0. , 1.98641497, 1.74397715],
               [1.27050802, 0.93539068, 0.89943544, 1.26707514, 0.64273896, 0.30492003, 1.66711515, 1.98641497, 0. , 0.24826848],
               [1.58767049, 0.54741303, 0.71481475, 1.69824053, 1.34170717, 0.91574163, 1.24966903, 1.74397715, 0.24826848, 0. ]])

Just to check:

Gs = [G1,G1]

ml_kernel = MultiscaleLaplacianFast(P=2, L=1, normalize=True)
K = ml_kernel.fit_transform(Gs)

Returns the following error:

/usr/local/lib/python3.6/dist-packages/grakel/kernels/multiscale_laplacian.py in <listcomp>(.0)
    183 A = x.get_adjacency_matrix()
    184 try:
--> 185     phi = np.array([list(phi_d[i]) for i in range(A.shape[0])])
    186 except TypeError:
    187     raise TypeError('Features must be iterable and castable '

KeyError: 0

giannisnik commented 4 years ago

@paulamartingonzalez I think you get this error because the keys of the feats_dict dictionary must be integers, not strings. For instance, feats_dict = {0: [0.0895097750195317, 1.0], 1: [0.00628627722849397, 0.8888888888888891],...}.
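The mismatch can be reproduced without GraKeL. The traceback shows the kernel looking features up with `phi_d[i] for i in range(A.shape[0])`, i.e. with integer indices, so a dictionary keyed by `'0'`, `'1'`, ... has no entry for `0`. A minimal sketch of the lookup and of the conversion, where the helper name `with_int_keys` is hypothetical and the feature dictionary is shortened to three nodes:

```python
def with_int_keys(feats):
    """Return a copy of feats whose keys are ints (e.g. '0' -> 0)."""
    return {int(key): value for key, value in feats.items()}


# String-keyed features, as in the failing example (first three nodes only).
feats_dict = {
    '0': [0.0895097750195317, 1.0],
    '1': [0.00628627722849397, 0.8888888888888891],
    '2': [0.0252031718194117, 1.0],
}
n_nodes = len(feats_dict)

# Mimics the lookup from the traceback: integer indices against string keys.
try:
    [list(feats_dict[i]) for i in range(n_nodes)]
except KeyError as err:
    print("KeyError:", err)

# After converting the keys, the same lookup succeeds.
fixed = with_int_keys(feats_dict)
phi = [list(fixed[i]) for i in range(n_nodes)]
```

Converting once, right after the features are loaded, keeps the rest of the pipeline unchanged.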

paulamartingonzalez commented 4 years ago

Oh, thanks for spotting! :) It now works, thanks so much!