Joel-De opened this issue 3 years ago

❓ Questions & Help

Is there an example, or can someone show me a simple model architecture of RGCN for edge classification? I've looked through the documentation and examples, but judging from the output shapes of the layers, the RGCN layer seems to use the relational information of a graph to improve node classification. Is there a way to use RGCN to perform edge classification, or do I need to approach it from another angle? Currently, the edge labels are required as one of the input parameters of the RGCN layer, but those labels are exactly what I am trying to predict (using `edge_index` and feature vectors), so I'm not sure how to go about doing that.

Thanks
If you do not have any edge types as input, then there is no need to use `RGCNConv`; you can instead use other GNN layers such as `GCNConv` or `SAGEConv`. You can find an example of classical link prediction here (binary classification), which you may want to modify into a multi-label edge classification problem. Here, you want to train against the ground-truth edge labels as well as the existence of an edge.
@rusty1s Hi, I am currently working on a problem that requires handling multi-label edges. I am able to construct a custom knowledge graph for my multi-label edge classification problem. Could you please give a hint as to what modifications the GNN needs when feeding in the data so that the model can handle multi-label edge data?
Do you mean that you have multi-dimensional edge features, or do you want to perform multi-label edge classification?
You should be able to train multi-label classification via `BCEWithLogitsLoss`, see here:

```python
# Encode nodes, concatenate the embeddings of both edge endpoints,
# and score every edge label with an MLP:
z = model(x, edge_index)
edge_feat = torch.cat([z[edge_index[0]], z[edge_index[1]]], dim=-1)
edge_pred = self.mlp(edge_feat)
loss = criterion(edge_pred, data.edge_label)  # criterion = BCEWithLogitsLoss()
```
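For reference, a fuller self-contained sketch of this pattern might look as follows. The two-layer GCN encoder, the hidden sizes, and the `data` fields are illustrative assumptions, not the exact setup discussed here:

```python
import torch
from torch_geometric.nn import GCNConv

class MultiLabelEdgeClassifier(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, num_edge_labels):
        super().__init__()
        self.conv1 = GCNConv(in_channels, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, hidden_channels)
        # Maps concatenated endpoint embeddings to one logit per edge
        # label (multi-label, so no softmax over labels):
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(2 * hidden_channels, hidden_channels),
            torch.nn.ReLU(),
            torch.nn.Linear(hidden_channels, num_edge_labels),
        )

    def forward(self, x, edge_index):
        z = self.conv1(x, edge_index).relu()
        z = self.conv2(z, edge_index)
        edge_feat = torch.cat([z[edge_index[0]], z[edge_index[1]]], dim=-1)
        return self.mlp(edge_feat)

# Training step, assuming `data.edge_label` is a float tensor of shape
# [num_edges, num_edge_labels] holding 0/1 entries:
# criterion = torch.nn.BCEWithLogitsLoss()
# loss = criterion(model(data.x, data.edge_index), data.edge_label)
```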
Hi @rusty1s, thanks for your response. This is exactly what I am doing in my code, but it always ends up crashing (due to an index error) at the line `z = model(x, edge_index)`. I believe that in a multigraph (assuming multiple edges between two nodes), the forward and propagation classes have to be edited to cater for multiple edges (instead of one edge) between each pair of nodes. Is my understanding correct? Is there any example code for multigraph edge classification?
Regards
The index error may result from `edge_index` containing invalid indices, i.e. you should confirm that the following runs through:

```python
assert edge_index.min() >= 0
assert edge_index.max() < x.size(0)
```
Apart from that, duplicated edges are supported by default within PyG; each duplicate is simply treated as its own edge during message passing.
@rusty1s You are right. Thanks for pointing that out. Could you please direct me on how to fix this issue?
```
    115 assert edge_index.min() >= 0
--> 116 assert edge_index.max() < x.size(0)

AssertionError:
```
This means that your graph is not correctly encoded. You need to ensure that all indices are correctly mapped to values between `0` and `num_nodes - 1`. Let me know if this helps!
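One common fix is to relabel the raw node IDs to consecutive integers before building `edge_index`. A minimal sketch (the raw IDs below are made up for illustration):

```python
import torch

# Hypothetical raw edges with arbitrary, non-consecutive node IDs:
raw_edges = [(105, 7), (7, 2048), (2048, 105)]

# Map every distinct node ID to a consecutive index in [0, num_nodes - 1]:
nodes = sorted({n for edge in raw_edges for n in edge})
id_map = {node_id: idx for idx, node_id in enumerate(nodes)}

edge_index = torch.tensor(
    [[id_map[src] for src, dst in raw_edges],
     [id_map[dst] for src, dst in raw_edges]], dtype=torch.long)

# The asserts from above now pass:
assert edge_index.min() >= 0
assert edge_index.max() < len(nodes)
```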
@rusty1s Thank you, that worked for me. But the GRUGconv model (that I'm using) for this experiment is giving low accuracy. I have tried changing model parameters and input features (edge features), but the accuracy always ends up at 65%. I am more concerned about getting the same accuracy value after trying several different approaches. Could my graph model even be correct if I am getting the same accuracy after trying several ways? What could I be missing?
If you are getting the same accuracy despite several changes, this is a sign that your model cannot learn anything. This might be due to the data or the model (hard to say). Is 65% training accuracy the performance you get by always picking the majority class? Why do you need the GRU in the first place? Does it perform better without it?
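To check that, the majority-class baseline can be read off the labels directly. A quick sketch, assuming `data.edge_label` holds integer class labels:

```python
# If your model's accuracy matches this number, it is likely just
# predicting the most frequent class for every edge:
counts = data.edge_label.bincount()
majority_acc = counts.max().item() / data.edge_label.numel()
print(f'Majority-class baseline: {majority_acc:.2%}')
```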
I'm getting the same accuracy for GCNConv and SAGEConv as well. The model does show learning (in my opinion), as the loss continues to decrease and the accuracy-vs-epoch and loss-vs-epoch plots look correct, but it always converges to the same accuracy value. Could you give a hint as to what steps can be taken to check/debug the issue?
`GCNConv` and `SAGEConv` are similar operators, so it is to be expected that they converge to a similar performance. To verify the gains of graph-based machine learning, you can try to replace your model with a simple MLP (`torch_geometric.nn.MLP`). In case this also converges to the same accuracy, there is definitely something fishy going on.
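A graph-free baseline along those lines could look like this (a sketch; `in_channels`, `num_classes`, and the hidden size are placeholders):

```python
from torch_geometric.nn import MLP

# The MLP never sees edge_index, so any gain of the GNN over this
# baseline must come from the graph structure:
model = MLP([in_channels, 64, num_classes], norm=None)
out = model(data.x)  # note: no edge_index is passed
```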
I replaced my model with `torch_geometric.nn.MLP` and unfortunately, I am getting the same accuracy. :( I am lost at this point. Where could the issue in the code be? Could you please point me to the possibilities that may lead to this? Thank you for your guidance.
Interesting. At this point I think you should validate your existing features or engineer better ones. Can you confirm that all your features contain reasonable values, e.g. that there do not exist any highly skewed values?
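A few quick checks along those lines (a sketch, assuming the node features live in `data.x`):

```python
import torch

x = data.x
print('min/max:', x.min().item(), x.max().item())
print('any NaNs:', torch.isnan(x).any().item())
print('per-feature std:', x.std(dim=0))

# Features on wildly different scales often benefit from standardization:
x = (x - x.mean(dim=0)) / x.std(dim=0).clamp(min=1e-6)
```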
Sorry for the late response, I got busy with other work. Yes, I did feature engineering and have no skewed data now. The accuracy did improve up to 69%, but it now always converges to this point no matter what changes I make to the hyperparameters. I suspect that this is due to the MLP on the edges, as the classification is done at the edges (which remains the same). Could I be correct in my suspicion? @rusty1s
Good to see the model performance is increasing. Can you clarify what you mean by "due to the MLP on the edges, as the classification is done at the edges (which remains the same)"?
@rusty1s I would like to use RGCN for a link prediction task with just one node type but 5 different edge types.

1. What do I pass as the `edge_index` parameter of the forward function? If I understand correctly, I'll have five different edge stores for the five edge types - do I just stack them together?
2. Is `edge_type` just the edge label in [0, n-1] for the corresponding entry in `edge_index`?
3. Should the `num_relations` parameter for the RGCN layer be initialized with 5+1=6 relations (the extra one to account for negative samples)?

Thank you for your time!
Yes, you can just concatenate them, and `edge_type` represents the edge type from 0 to 4 for every edge. For negative samples, I assume you want to have these in the output (not in the graph/input to the network), so you would map to 6 possible classes.
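Concretely, building the `RGCNConv` inputs from five separate edge stores might look like this (a sketch; `edge_indices` is an assumed list of five `[2, num_edges_i]` tensors, one per relation):

```python
import torch

edge_index = torch.cat(edge_indices, dim=1)  # stack all relations
# Tag every edge with the id of the store it came from (0 to 4):
edge_type = torch.cat([
    torch.full((ei.size(1),), r, dtype=torch.long)
    for r, ei in enumerate(edge_indices)
])

out = conv(x, edge_index, edge_type)  # conv = RGCNConv(..., num_relations=5)
```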
Thanks @rusty1s! Right, I am only including negative samples during training, so I need to map to 6 different classes, where 0 corresponds to the negative edge/no edge case. This makes sense to me. However, just to clarify, does this mean I need to initialize the `num_relations` param with 6? Initializing with the true number of classes, i.e. 5, doesn't seem to work, since I get an index out of bounds exception. Initializing with 6 works, which kinda makes sense, because the RGCN layer cannot implicitly know whether negative samples are present or not.
Is there any specific reason you want to use negative edges for message passing? Ideally, you just want to train against them (via `edge_label_index`/`edge_label`), while keeping the original graph for message passing (`edge_index`/`edge_type`).
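In code, this separation might look roughly as follows (a sketch; `encoder` and `decoder` are placeholder modules, and `negative_sampling` draws non-existing edges):

```python
import torch
import torch.nn.functional as F
from torch_geometric.utils import negative_sampling

# Message passing only ever sees the observed (positive) graph:
z = encoder(data.x, data.edge_index, data.edge_type)

# Supervision uses the positives plus freshly sampled negatives:
neg_edge_index = negative_sampling(data.edge_index, num_nodes=data.num_nodes)
edge_label_index = torch.cat([data.edge_index, neg_edge_index], dim=1)
# Classes 1..5 for the real edge types, 0 for "no edge":
edge_label = torch.cat([
    data.edge_type + 1,
    torch.zeros(neg_edge_index.size(1), dtype=torch.long),
])

logits = decoder(z, edge_label_index)  # maps each pair to 6 classes
loss = F.cross_entropy(logits, edge_label)
```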
No, you are right. I don't want to use negative edges for message passing. From what I understand, if I have 5 positive edge types and add negative edges during training, then `edge_label` will have values in [0, 5], assuming I use 0 as the label for the negative edge/no edge scenario. That's where I get the error when initializing the RGCN layer:

```python
self.conv1 = RGCNConv(input_size, hidden_size, num_relations=6)
```

If I pass 5 to `num_relations`, i.e. the number of positive edge types, I get the index out-of-bounds error.
If it helps, I don't necessarily split my edges into message or supervision edges. My training set is a list of multiple small independent graphs, and I am not training on disjoint message/supervision edges. Each training batch is just one graph in the training set, and I use gradient accumulation to speed up training. Although this might sound counterintuitive, this is okay given that I am only training the graph neural net to learn rich representations that capture relational info, which is then used for a completely different downstream task.
Maybe I am implementing this incorrectly, but if I look at this example here: https://github.com/pyg-team/pytorch_geometric/blob/84ce7fe14d0fecaa9421fdfd122e1503b47530bc/examples/rgcn_link_pred.py#L90C1-L95C68
Does this mean that on line 90 `data.edge_type` corresponds to the message passing edges, and `data.train_edge_type` on lines 92 and 95 corresponds to the supervision edges? If so, then for my setup both of these should be the same, I think. In this case, `data.edge_type` and `data.train_edge_type` for my setup should only differ by the negative edges.
Or maybe I don't understand what you mean by training against negative edges. Am I not supposed to have a label for the negative edge and be able to predict it?
> Does this mean that on line 90 `data.edge_type` corresponds to the message passing edges, and `data.train_edge_type` on lines 92 and 95 corresponds to the supervision edges?
Yes. IMO, you should make sure that `data.edge_type` does not contain negative samples.
Thank you so much! This makes sense to me now. Just one last follow-up, if that's okay. Looking at the example again: https://github.com/pyg-team/pytorch_geometric/blob/e213c297bb2aeb9ac50db258f5ab01ea11aea349/examples/rgcn_link_pred.py#L92C1-L95C68
Why is the same `data.edge_train_type` (which I understand is just the positive edge labels) passed to the decoder for both positive and negative edge prediction? Shouldn't we pass just a tensor of zeros as the negative edge labels to the decoder for the negative edges?
Here in the example, we just assume a random `edge_type` for the negatives, so we re-use `edge_train_type` when computing the score for negative links.