Closed antonioaa1979 closed 3 years ago
@AlexandrMelnic can you take a look at this error thrown by plot_subgraph? I feel it would be easier since you know the code better :D
Thanks
Hi! Sure, I will take a look. @antonioaa1979, if you don't mind, can you send me your code so I can better understand the problem?
Thanks both! @AlexandrMelnic, regarding the code: I build a dataset and a model pretty much following the citation_gcn example script, with some customization of the Dataset class, and after that I run the GNNExplainer lines that you see above. I could send the code if you want, but not the data (it's a company dataset that I am not authorized to export), so I'm not sure that will help. But I can reproduce the issue locally at any time, and I can run commands for you and export the output, if that works.
hi @AlexandrMelnic doing some debugging, if this can help, the issue happens within _explainer_cleaning function, specifically at those lines:
selected_subgraph shape: (1653, 1653) selected_adj_mask shape: (2349,)
# remove the edges which value is < a_thresh
selected_adj_mask = tf.where(
selected_subgraph.values >= a_thresh, selected_subgraph.values, 0
)
selected_subgraph shape: (1653, 1653) selected_adj_mask shape: (2589,)
Specifically, the shape of selected_adj_mask changes after the line above, which makes the following line fail:
selected_subgraph = tf.sparse.map_values(
tf.multiply, self.comp_graph, selected_adj_mask
)
since the values of comp_graph have a shape of (2349,)
-> tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2349] vs. [2589] [Op:Mul]
Hi, sorry for the delay. In the coming week I will have more time and I can try to solve this issue. It won't make sense as far as the model is concerned, but can you try running the explainer with the adjacency matrix as input and without passing gcn_filter? Furthermore, did you try other values of a_thresh? One more thing: when you use the correct model with gcn_filter, can you also print the shapes of selected_subgraph and selected_subgraph.values at the earlier steps of _explainer_cleaning, as you did above?
Thanks @AlexandrMelnic ! Tried with different a_thresh and get same error message (with unchanging "incompatible shapes" values):
explainer = GNNExplainer(model, preprocess=gcn_filter, verbose=True)
adj_mask, feat_mask = explainer.explain_node(x=x_exp, a=a_exp, node_idx=node_idx)
G = explainer.plot_subgraph(adj_mask, feat_mask, node_idx)
-> tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2739] vs. [2955] [Op:Mul]
G = explainer.plot_subgraph(adj_mask, feat_mask, node_idx, a_thresh=0.001)
-> tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2739] vs. [2955] [Op:Mul]
G = explainer.plot_subgraph(adj_mask, feat_mask, node_idx, a_thresh=0.5)
-> tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2739] vs. [2955] [Op:Mul]
Also tried without gcn_filter preprocessing, and got same error message (but with changing "incompatible shapes" values):
explainer = GNNExplainer(model, verbose=True)
adj_mask, feat_mask = explainer.explain_node(x=x_exp, a=a_exp, node_idx=node_idx)
G = explainer.plot_subgraph(adj_mask, feat_mask, node_idx)
-> tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [1088] vs. [1304] [Op:Mul]
Hope this helps, Antonio
I see.....
Can you try this, so I can understand at which point the problem occurs:
def explainer_cleaning(adj_mask, a_thresh):
    selected_adj_mask = tf.nn.sigmoid(adj_mask)
    # convert into a binary matrix
    comp_graph_values = tf.ones_like(exp.comp_graph.values)
    comp_graph = tf.sparse.SparseTensor(
        exp.comp_graph.indices, comp_graph_values, exp.comp_graph.shape
    )
    print(comp_graph.shape, comp_graph.values.shape, selected_adj_mask.shape)
    # get the final masked adj matrix
    selected_subgraph = tf.sparse.map_values(
        tf.multiply, comp_graph, selected_adj_mask
    )
    print(comp_graph.shape, comp_graph.values.shape, selected_adj_mask.shape)
    # impose the symmetry of the adj matrix
    selected_subgraph = (
        tf.sparse.add(selected_subgraph, tf.sparse.transpose(selected_subgraph)) / 2
    )
    print(comp_graph.shape, comp_graph.values.shape, selected_adj_mask.shape)
    # remove the edges which value is < a_thresh
    selected_adj_mask = tf.where(
        selected_subgraph.values >= a_thresh, selected_subgraph.values, 0
    )
    print(comp_graph.shape, comp_graph.values.shape, selected_adj_mask.shape)
    selected_subgraph = tf.sparse.map_values(
        tf.multiply, comp_graph, selected_adj_mask
    )
    print(comp_graph.shape, comp_graph.values.shape, selected_adj_mask.shape)

explainer_cleaning(adj_mask, 0.1)
It should take as input the adjacency mask that is returned by the explain_node method.
here we go..
exp = explainer
explainer_cleaning(adj_mask, 0.1)
(1653, 1653) (2739,) (2739,)
(1653, 1653) (2739,) (2739,)
(1653, 1653) (2739,) (2739,)
(1653, 1653) (2739,) (2955,)
Traceback (most recent call last):
File "<ipython-input-47-f9dee555135b>", line 1, in <module>
explainer_cleaning(adj_mask, 0.1)
File "<ipython-input-35-75bc29f78de7>", line 31, in explainer_cleaning
tf.multiply, comp_graph, selected_adj_mask
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py", line 206, in wrapper
return target(*args, **kwargs)
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/tensorflow/python/ops/sparse_ops.py", line 2931, in map_values
op(*inner_args, **inner_kwargs),
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py", line 206, in wrapper
return target(*args, **kwargs)
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/tensorflow/python/ops/math_ops.py", line 530, in multiply
return gen_math_ops.mul(x, y, name)
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 6240, in mul
_ops.raise_from_not_ok_status(e, name)
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 6897, in raise_from_not_ok_status
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2739] vs. [2955] [Op:Mul]
extra elements:
selected_adj_mask = tf.nn.sigmoid(adj_mask)
# convert into a binary matrix
comp_graph_values = tf.ones_like(exp.comp_graph.values)
comp_graph = tf.sparse.SparseTensor(
exp.comp_graph.indices, comp_graph_values, exp.comp_graph.shape
)
print(comp_graph.shape, comp_graph.values.shape, selected_adj_mask.shape)
# get the final masked adj matrix
selected_subgraph = tf.sparse.map_values(
tf.multiply, comp_graph, selected_adj_mask
)
print(comp_graph.shape, comp_graph.values.shape, selected_adj_mask.shape)
# impose the symmetry of the adj matrix
selected_subgraph = (
tf.sparse.add(selected_subgraph, tf.sparse.transpose(selected_subgraph)) / 2
)
print(comp_graph.shape, comp_graph.values.shape, selected_adj_mask.shape, selected_subgraph.shape,
selected_subgraph.values.shape)
(1653, 1653) (2739,) (2739,)
(1653, 1653) (2739,) (2739,)
(1653, 1653) (2739,) (2739,) (1653, 1653) (2955,)
# remove the edges which value is < a_thresh
selected_adj_mask = tf.where(
selected_subgraph.values >= a_thresh, selected_subgraph.values, 0
)
print(comp_graph.shape, comp_graph.values.shape, selected_adj_mask.shape, selected_subgraph.shape,
selected_subgraph.values.shape)
(1653, 1653) (2739,) (2955,) (1653, 1653) (2955,)
selected_subgraph = tf.sparse.map_values(
tf.multiply, comp_graph, selected_adj_mask
)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2739] vs. [2955] [Op:Mul]
Can you try again with this:
def explainer_cleaning(adj_mask, explainer, a_thresh):
    selected_adj_mask = tf.nn.sigmoid(adj_mask)
    # convert into a binary matrix
    comp_graph_values = tf.ones_like(explainer.comp_graph.values)
    comp_graph = tf.sparse.SparseTensor(
        explainer.comp_graph.indices, comp_graph_values, explainer.comp_graph.shape
    )
    print('1 comp_graph shape:', comp_graph.shape)
    print('1 comp_graph values shape: ', comp_graph.values.shape)
    print('1 selected_adj_mask shape: ', selected_adj_mask.shape)
    # get the final masked adj matrix
    selected_subgraph = tf.sparse.map_values(
        tf.multiply, comp_graph, selected_adj_mask
    )
    print('2 selected_subgraph shape: ', selected_subgraph.shape)
    print('2 selected_subgraph values shape: ', selected_subgraph.values.shape)
    is_nonzero = tf.not_equal(selected_subgraph.values, 0)
    selected_subgraph = tf.sparse.retain(selected_subgraph, is_nonzero)
    print('2 selected_subgraph shape after retain: ', selected_subgraph.shape)
    print('2 selected_subgraph values shape after retain: ', selected_subgraph.values.shape)
    # impose the symmetry of the adj matrix
    selected_subgraph = (
        tf.sparse.add(selected_subgraph, tf.sparse.transpose(selected_subgraph)) / 2
    )
    print('3 selected_subgraph shape: ', selected_subgraph.shape)
    print('3 selected_subgraph values shape: ', selected_subgraph.values.shape)
    is_nonzero = tf.not_equal(selected_subgraph.values, 0)
    selected_subgraph = tf.sparse.retain(selected_subgraph, is_nonzero)
    print('3 selected_subgraph shape after retain: ', selected_subgraph.shape)
    print('3 selected_subgraph values shape after retain: ', selected_subgraph.values.shape)
    # remove the edges which value is < a_thresh
    selected_adj_mask = tf.where(
        selected_subgraph.values >= a_thresh, selected_subgraph.values, 0
    )
    print('4 selected_adj_mask shape: ', selected_adj_mask.shape)
    selected_subgraph = tf.sparse.map_values(
        tf.multiply, comp_graph, selected_adj_mask
    )
    print('5 selected_subgraph shape: ', selected_subgraph.shape)
    print('5 selected_subgraph values shape: ', selected_subgraph.values.shape)
    is_nonzero = tf.not_equal(selected_subgraph.values, 0)
    selected_subgraph = tf.sparse.retain(selected_subgraph, is_nonzero)
It can happen that some of the stored values of the sparse matrices are zero; dropping them with tf.sparse.retain, as done above, might solve the problem.
Otherwise, I am thinking that the elements on the diagonal of the a matrix could be the problem. For this reason, can you convert your a matrix into a binary adjacency matrix with this function:
def binary_adj_converter(a_in):
    """
    Transforms a graph matrix into the binary adjacency matrix.
    **Arguments**
    - `a_in`: sparse `(n_nodes, n_nodes)` graph tensor;
    """
    a_idx = a_in.indices
    off_diag_idx = tf.not_equal(a_idx[:, 0], a_idx[:, 1])
    a_idx = a_idx[off_diag_idx]
    a = tf.sparse.SparseTensor(
        a_idx, tf.ones(a_idx.shape[0], dtype=tf.float32), a_in.shape
    )
    return a
After you have converted the a matrix, feed it into the explainer method without the gcn_filter. You can also run the function above again so we can see the shapes. Thanks
the first test leads to this:
explainer_cleaning(adj_mask, explainer, a_thresh=0.1)
1 comp_graph shape: (1653, 1653)
1 comp_graph values shape: (2739,)
1 selected_adj_mask shape: (2739,)
2 selected_subgraph shape: (1653, 1653)
2 selected_subgraph values shape: (2739,)
2 selected_subgraph shape after retain: (1653, 1653)
2 selected_subgraph values shape after retain: (2739,)
3 selected_subgraph shape: (1653, 1653)
3 selected_subgraph values shape: (2955,)
3 selected_subgraph shape after retain: (1653, 1653)
3 selected_subgraph values shape after retain: (2955,)
4 selected_adj_mask shape: (2955,)
Traceback (most recent call last):
...
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2739] vs. [2955] [Op:Mul]
the second test doesn't run for me. I build the adjacency matrix as a COO sparse matrix:
a = sp.coo_matrix((weights[:, 2], (weights[:, 0], weights[:, 1])), shape=shape, dtype=weights.dtype)
when I run the binary converter, I get this error at the first line of the function:
a = binary_adj_converter(a)
File "<ipython-input-33-39e085bf0893>", line 104, in binary_adj_converter
a_idx = a_in.indices
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/scipy/sparse/base.py", line 687, in __getattr__
raise AttributeError(attr + " not found")
AttributeError: indices not found
But if I convert the adjacency matrix to binary with my own procedure, I still see the issue:
a = sp.coo_matrix((np.ones(len(weights)), (weights[:, 0], weights[:, 1])), shape=shape, dtype=weights.dtype) # Binary
...
G = explainer.plot_subgraph(adj_mask, feat_mask, node_idx)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2739] vs. [2955] [Op:Mul]
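For reference, the AttributeError above is consistent with binary_adj_converter expecting a TensorFlow SparseTensor: a SciPy COO matrix keeps its coordinates in .row and .col and has no .indices attribute. A minimal SciPy-side sketch of the same idea follows. binary_adj_converter_scipy is a hypothetical helper (not part of Spektral), and the small weights array is made-up illustration data.

```python
import numpy as np
import scipy.sparse as sp


def binary_adj_converter_scipy(a_in):
    """Hypothetical SciPy counterpart of binary_adj_converter.

    Drops the diagonal and sets every remaining stored entry to 1.
    """
    a_coo = sp.coo_matrix(a_in)  # COO exposes .row / .col, not .indices
    off_diag = a_coo.row != a_coo.col  # boolean mask of off-diagonal entries
    rows, cols = a_coo.row[off_diag], a_coo.col[off_diag]
    data = np.ones(rows.shape[0], dtype=np.float32)
    a = sp.coo_matrix((data, (rows, cols)), shape=a_coo.shape)
    a.sum_duplicates()  # merge repeated (row, col) pairs in place...
    a.data[:] = 1.0     # ...and reset merged values so the result stays binary
    return a


# Made-up edge list in (source, target, weight) form, including one self-loop.
weights = np.array([[0, 1, 0.5], [1, 0, 0.5], [2, 2, 3.0], [1, 2, 0.2]])
rows = weights[:, 0].astype(int)
cols = weights[:, 1].astype(int)
a = sp.coo_matrix((weights[:, 2], (rows, cols)), shape=(3, 3))

a_bin = binary_adj_converter_scipy(a)
print(a_bin.toarray())
```

The self-loop on node 2 is removed and the three remaining edges all carry the value 1.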
Ok the problem is in the transpose operation I guess. I will try to fix it. Thank you for the collaboration!
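A plain-Python sketch (no TensorFlow) of why the transpose step could be the culprit, assuming the stored sparsity pattern of comp_graph is not symmetric: adding a sparse matrix to its transpose stores the union of both index sets, so every one-way entry gains a mirrored partner and the values vector grows. The function name symmetrize_pattern and the three-edge graph are illustrative only.

```python
def symmetrize_pattern(indices):
    """Return the sorted stored-index set of A + A^T for a sparse A."""
    stored = set(indices)
    transposed = {(col, row) for row, col in stored}
    return sorted(stored | transposed)


# Hypothetical 3-node graph with a non-symmetric pattern:
# 0 -> 1 has no stored reverse entry, while 1 <-> 2 is stored both ways.
edges = [(0, 1), (1, 2), (2, 1)]
sym_edges = symmetrize_pattern(edges)

print(len(edges), len(sym_edges))  # 3 stored entries before, 4 after
```

If that reading is right, the jump from 2739 to 2955 stored values in the logs above would correspond to 216 entries without a stored reverse entry, and the enlarged values vector can no longer be multiplied element-wise with the 2739 values of comp_graph.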
thank you.. feel free to ping me for testing the fix
Thanks for working on this @AlexandrMelnic, let me know if I can speed up testing/pushing the fix.
Cheers
Let's see if this works. I rewrote the function a little bit; there was no real reason to have that transposition at that point. Can you try this function just as before:
def explainer_cleaning(explainer, adj_mask, a_thresh):
    selected_adj_mask = tf.nn.sigmoid(adj_mask)
    print('1 selected_adj_mask ', selected_adj_mask.shape)
    comp_graph_values = tf.ones_like(explainer.comp_graph.values)
    comp_graph = tf.sparse.SparseTensor(
        explainer.comp_graph.indices, comp_graph_values, explainer.comp_graph.shape
    )
    print('2 comp_graph.values ', comp_graph.values.shape)
    selected_adj_mask = tf.where(
        selected_adj_mask >= a_thresh, selected_adj_mask, 0
    )
    print('3 selected_adj_mask ', selected_adj_mask.shape)
    selected_subgraph = tf.sparse.map_values(
        tf.multiply, comp_graph, selected_adj_mask
    )
    print('4 selected_subgraph ', selected_subgraph.values.shape)
    is_nonzero = tf.not_equal(selected_subgraph.values, 0)
    selected_subgraph = tf.sparse.retain(selected_subgraph, is_nonzero)
    print('5 selected_subgraph ', selected_subgraph.values.shape)
    selected_subgraph = (
        tf.sparse.add(selected_subgraph, tf.sparse.transpose(selected_subgraph)) / 2
    )
    print('6 selected_subgraph ', selected_subgraph.values.shape)
    return selected_subgraph
Hi @AlexandrMelnic the function doesn't produce errors:
explainer_cleaning(explainer, adj_mask, a_thresh=0.1)
1 selected_adj_mask (2739,)
2 comp_graph.values (2739,)
3 selected_adj_mask (2739,)
4 selected_subgraph (2739,)
5 selected_subgraph (2739,)
6 selected_subgraph (2955,)
Out[22]: <tensorflow.python.framework.sparse_tensor.SparseTensor at 0x15a138f90>
Please let me know once the fix is committed and available, if you don't mind ;)
Hi @antonioaa1979 ,
I've merged @AlexandrMelnic's change into the develop branch, can you verify that it solves your issue by installing the latest version from source?
git clone https://github.com/danielegrattarola/spektral.git
cd spektral
git checkout develop
pip install -U .
Thanks
Thanks both! Hi @danielegrattarola, I have followed the above procedure, but now I get the following error when training the model:
history = model.fit(loader_tr.load(), steps_per_epoch=loader_tr.steps_per_epoch, validation_data=loader_va.load(),
validation_steps=loader_va.steps_per_epoch, epochs=epochs,
callbacks=[EarlyStopping(patience=patience, restore_best_weights=True),])
Traceback (most recent call last):
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-14-39d8faee841b>", line 4, in <module>
history = model.fit(loader_tr.load(), steps_per_epoch=loader_tr.steps_per_epoch, validation_data=loader_va.load(),
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/spektral/data/loaders.py", line 240, in load
output = self.collate(self.dataset)
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/spektral/data/loaders.py", line 221, in collate
output = to_disjoint(**packed)
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/spektral/data/utils.py", line 52, in to_disjoint
a_out = sp.block_diag(a_list)
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/scipy/sparse/construct.py", line 688, in block_diag
data.append(a.ravel())
AttributeError: 'SparseTensor' object has no attribute 'ravel'
Not sure if this is due to the develop branch? How can i verify i have installed the latest develop branch correctly? This is what i got from above pip install command:
spektral % pip install -U .
...
Building wheels for collected packages: spektral
Building wheel for spektral (setup.py) ... done
Created wheel for spektral: filename=spektral-1.0.6-py3-none-any.whl size=123388 sha256=1240b7b30ad0b0252736dc3b830279cfc5a6b2e22f5a98dc80db97570e006c13
Stored in directory: /private/var/folders/g9/yz6yp_5n1l3cs72t4_1bsl_w0000gn/T/pip-ephem-wheel-cache-hq4fzoi8/wheels/77/05/37/6661e4114ebe416489cd00534974ec6bdc81c919032683bd41
Successfully built spektral
Installing collected packages: spektral
Attempting uninstall: spektral
Found existing installation: spektral 1.0.7
Uninstalling spektral-1.0.7:
Successfully uninstalled spektral-1.0.7
Successfully installed spektral-1.0.6
Thanks
PS: how do i go back to master version?
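One way to check which build ended up in the environment is to ask the Python packaging metadata directly. This is only a sketch: installed_version is a hypothetical helper, and importlib.metadata is assumed to be available (it is part of the standard library from Python 3.8 onward, so it would not work as-is on the Python 3.7 interpreter shown in the tracebacks).

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package_name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version pip recorded for Spektral, or None if it is not installed.
print(installed_version("spektral"))
```

In the pip log above, the develop build reported itself as 1.0.6 while the previous install was 1.0.7, so the version string alone may not be enough to tell the two branches apart.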
Yes, that is due to some new changes in develop.
Previously, when creating the dataset, you had to convert the adjacency matrix from SciPy sparse to SparseTensor. Now the Loader expects a SciPy sparse matrix.
So, if you have the AdjToSpTensor transform used anywhere, you should remove it and let the adjacency matrix be a SciPy sparse matrix or a NumPy array.
PS: how do i go back to master version?
cd spektral
git checkout master
ok, removed the AdjToSpTensor transform and the model now does fit. But got this error in running explain_node:
adj_mask, feat_mask = explainer.explain_node(x=x_exp, a=a_exp, node_idx=node_idx)
Out[29]:
Traceback (most recent call last):
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-28-0f5c5af0c10f>", line 1, in <module>
adj_mask, feat_mask = explainer.explain_node(x=x_exp, a=a_exp, node_idx=node_idx)
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/spektral/models/gnn_explainer.py", line 107, in explain_node
a, node_idx, self.n_hops, self.preprocess
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/spektral/models/gnn_explainer.py", line 314, in k_hop_sparse_subgraph
if a.dtype != tf.float32:
TypeError: Cannot interpret 'tf.float32' as a data type
a_exp.dtype
Out[30]: dtype('float32')
tf.float32
Out[31]: tf.float32
a_exp.dtype != tf.float32
Out[32]:
Traceback (most recent call last):
File ".pyenv/versions/3.7.6/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-32-ce553183aea3>", line 1, in <module>
a_exp.dtype != tf.float32
TypeError: Cannot interpret 'tf.float32' as a data type
type(tf.float32)
Out[33]: tensorflow.python.framework.dtypes.DType
tf.__version__
Out[34]: '2.5.0'
it seems like the fit method expects an adjacency matrix in SciPy sparse format, while the explainer still expects it as a TensorFlow SparseTensor?
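The TypeError above appears to come from comparing a NumPy dtype with a TensorFlow DType: NumPy tries to interpret the right-hand side as one of its own dtypes and fails. One possible workaround, sketched below, is to compare dtypes by name, relying on the fact that both np.dtype and tf.DType expose a .name attribute. has_dtype is a hypothetical helper, not part of Spektral, and the example only uses NumPy arrays.

```python
import numpy as np


def has_dtype(array_like, dtype_name="float32"):
    """Compare dtypes by name so NumPy/SciPy and TensorFlow objects share one check."""
    return array_like.dtype.name == dtype_name


# Stand-in for a_exp; a SciPy sparse matrix exposes the same NumPy .dtype.
a_exp = np.zeros((2, 2), dtype=np.float32)

print(has_dtype(a_exp))                     # float32 input
print(has_dtype(a_exp.astype(np.float64)))  # float64 input
```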
When you pass a_exp to explain_node, can you convert it into a sparse tensor first? That way we can check whether everything else works. Then I can rewrite the function so that it takes a SciPy sparse matrix as input.
good one... here we go...
adj_mask, feat_mask = explainer.explain_node(x=x_exp, a=spektral.utils.sparse.sp_matrix_to_sp_tensor(a_exp), node_idx=node_idx)
...
G = explainer.plot_subgraph(adj_mask, feat_mask, node_idx)
plt.show()
all runs smoothly and produces a nice chart! Thanks, i can confirm the bugfix works!
Thanks to both of you for working through this issue.
I will merge the fix into master as soon as possible.
Cheers
Hi Daniele and all,
thanks for creating and maintaining this great library!
I have been trying to use GNNExplainer, but I keep seeing the error message below. I still don't know if it's a bug or something I am doing wrong on my side, but there is not much documentation or many examples around it.
I am able to run the sample code at https://github.com/danielegrattarola/spektral/blob/master/examples/other/explain_node_predictions.py smoothly.
But when applying to my dataset (where I can run successfully a GCN model), i get the below: