in GRATIS/Graph Data/Graph Classification/nets/superpixels_graph_classification/gated_gcn_net.py line 40:
kfea = kfea.unsqueeze(1).repeat(1, N, 1) #[B,N,D]
that means the Key matrix has identical rows, and because the Value matrix is built from the same tensor as the Key, the Value matrix has identical rows too.
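To illustrate, here is a minimal NumPy sketch of what that `unsqueeze(1).repeat(1, N, 1)` does (NumPy is used here in place of torch purely for illustration): a [B, D] tensor becomes [B, N, D] with the same vector copied N times along the new axis.

```python
import numpy as np

B, N, D = 2, 4, 3
kfea = np.arange(B * D, dtype=float).reshape(B, D)   # [B, D]

# NumPy analogue of kfea.unsqueeze(1).repeat(1, N, 1)
k_rep = np.repeat(kfea[:, None, :], N, axis=1)        # [B, N, D]

assert k_rep.shape == (B, N, D)
# every row along the N axis is the same vector per batch element
assert all(np.array_equal(k_rep[b, i], kfea[b])
           for b in range(B) for i in range(N))
```

So whatever attention is computed downstream, it only ever sees N copies of one key (and one value) per batch element.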
In GRATIS/Graph Data/Graph Classification/layers/cross_attention_layer.py, you use the LinearAttention function in line 96, which comes from the Linear Transformer proposed in "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention" (modified from https://github.com/idiap/fast-transformers/blob/master/fast_transformers/attention/linear_attention.py),
and in line 132 you use FullAttention.
But no matter which one is used, the queried_values equal the Values: with identical keys the attention weights are uniform, so the weighted average just returns the repeated value vector. That means your attention mechanism doesn't work at all.
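To make that claim concrete, here is a small NumPy sketch (standing in for the torch code) of full softmax attention when every key row, and hence every value row, is the same vector: the softmax weights collapse to uniform 1/N, and the output row equals that repeated vector for any query.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
N, D = 5, 4
Q = rng.standard_normal((N, D))     # arbitrary queries
k = rng.standard_normal(D)          # the single key vector
K = np.tile(k, (N, 1))              # keys repeated -> identical rows
V = K.copy()                        # values built from the same tensor

scores = Q @ K.T / np.sqrt(D)       # each row of scores is constant across columns
weights = softmax(scores, axis=-1)  # -> uniform 1/N attention weights
out = weights @ V                   # -> every output row is exactly k

assert np.allclose(weights, 1.0 / N)
assert np.allclose(out, np.tile(k, (N, 1)))
```

The same degeneration holds for the linear (kernel feature map) variant, since any convex combination of identical value rows returns that same row.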