DeepGraphLearning / torchdrug

A powerful and flexible machine learning platform for drug discovery
https://torchdrug.ai/
Apache License 2.0

KnowledgeGraphCompletion interpretation #69

Closed · Orbifold closed this issue 2 years ago

Orbifold commented 2 years ago

After a KnowledgeGraphCompletion task has been trained on FB15k237, the prediction is a tensor of shape [*, 2, 14541] containing positive and negative numbers, while the target contains booleans. How does this prediction actually correspond to edges in the graph? All examples emphasize accuracy and scores, but none explain what is being predicted and how it relates to the original data.

Orbifold commented 2 years ago

Unrelated to the question: in the docs, the method all_prefix_slice refers to num_cum_xs where it should be sizes, and the parameter is named size where it should be sizes.

KiddoZhu commented 2 years ago

Hi! The prediction of KnowledgeGraphCompletion depends on whether it is training or test. It seems you are asking about the test case.

During training, it predicts logits of shape (batch_size, 1 + num_negative): for each of the batch_size positive samples, one logit for the positive triplet and num_negative logits for its negative samples. pred[:, 0] holds the predictions for the positive samples.
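
Roughly, the logits pair with a binary target like this (only an illustration with dummy tensors, not the exact torchdrug code):

import torch
import torch.nn.functional as F

batch_size, num_negative = 4, 8
pred = torch.randn(batch_size, 1 + num_negative)   # logits: positive sample in column 0
target = torch.zeros(batch_size, 1 + num_negative)
target[:, 0] = 1                                    # label 1 for the positive, 0 for the negatives
loss = F.binary_cross_entropy_with_logits(pred, target)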

During test, for each positive triplet (h, r, t), it predicts the logits for all possible tails (h, r, *) and all possible heads (*, r, t). The predictions are then stacked to form a tensor of shape (batch_size, 2, num_entity), where pred[:, 0, :] is the prediction for all possible tails, and pred[:, 1, :] is the prediction for all possible heads.
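
For example, here is a rough sketch (not the library's evaluation code) of how the unfiltered rank of the ground-truth entity can be read off that tensor; the random pred below just stands in for the output of task.predict:

import torch

batch_size, num_entity = 4, 14541                  # FB15k237 has 14541 entities
pred = torch.randn(batch_size, 2, num_entity)      # stand-in for the output of task.predict
pos_h = torch.randint(num_entity, (batch_size,))   # ground-truth head indices
pos_t = torch.randint(num_entity, (batch_size,))   # ground-truth tail indices

tail_logits = pred[:, 0, :]                        # scores for (h, r, *)
head_logits = pred[:, 1, :]                        # scores for (*, r, t)

# unfiltered rank of the true entity: 1 + number of entities scored strictly higher
tail_rank = 1 + (tail_logits > tail_logits.gather(1, pos_t.unsqueeze(1))).sum(dim=1)
head_rank = 1 + (head_logits > head_logits.gather(1, pos_h.unsqueeze(1))).sum(dim=1)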

Orbifold commented 2 years ago

Thanks for the quick reply 👍

Just to check what you say (and maybe it'll also help others): suppose I take a sample of triples and make predictions like so:

import torch

triples = test_set[:10]
preds = task.cpu().predict(triples)
tails = torch.softmax(preds[:, 0, :].detach(), dim=1)
heads = torch.softmax(preds[:, 1, :].detach(), dim=1)

Then I extract the highest-probability tail and head predictions for each triple:

top_tails = tails.topk(1)
top_heads = heads.topk(1)

and noting that a triple (head,predicate,tail) is actually [head, tail, predicate] in the dataset:

for i, (h, t, r) in enumerate(triples):
    print("(%s)-[%s]->(%s): %s" % (
        h.item(), r.item(), top_tails.indices[i].item(),
        "{:.2%}".format(top_tails.values[i].item())))

this outputs something like

(6180)-[148]->(9): 0.02% 
(1454)-[92]->(9): 0.02% 
(6651)-[13]->(32): 0.02% 
...
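
For completeness, to relate the integer IDs back to the original data I map them through the dataset vocabularies; I'm assuming the FB15k237 dataset object exposes entity_vocab and relation_vocab (please correct me if the attribute names differ):

id2entity = dataset.entity_vocab        # assumed attribute: index -> entity name
id2relation = dataset.relation_vocab    # assumed attribute: index -> relation name

h, t, r = triples[0]
print("(%s)-[%s]->(%s)" % (id2entity[h.item()], id2relation[r.item()],
                           id2entity[top_tails.indices[0].item()]))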

All correct?

KiddoZhu commented 2 years ago

That's right!

Note that knowledge graph completion models are usually trained with negative sampling and binary cross entropy, so it doesn't make much sense to normalize the output with softmax. I would suggest applying torch.sigmoid to each prediction instead. That said, topk after softmax returns the same indices as topk after sigmoid (both are monotonic in the logits), so your code still outputs the correct top-k tails and heads.
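
Concretely, reusing your preds tensor, something like:

tail_probs = torch.sigmoid(preds[:, 0, :].detach())   # independent probability per candidate tail
head_probs = torch.sigmoid(preds[:, 1, :].detach())   # independent probability per candidate head
top_tails = tail_probs.topk(1)                        # same indices as with softmax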