amazon-science / tgl

Apache License 2.0

Feature request: link prediction script #2

Open moudheus opened 2 years ago

moudheus commented 2 years ago

Currently TGL only provides training scripts. I would like to predict the top K most likely future links after training.

Example:

Alternatively, I would appreciate documentation on how to achieve this on my own. I believe it is doable by generating the list of candidate edges as a test set and then extracting the scores from the evaluation function.
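To make the idea concrete, here is a minimal sketch of what I have in mind. It is not TGL code: `candidate_edges`, `top_k_links`, and `toy_score` are hypothetical names, and `toy_score` is only a stand-in for whatever score the trained model's evaluation function would return for an edge.

```python
import heapq
from typing import Callable, Iterable, List, Tuple

Edge = Tuple[int, int]


def candidate_edges(sources: Iterable[int], destinations: Iterable[int]) -> List[Edge]:
    """Enumerate every (src, dst) pair, skipping self-loops."""
    dsts = list(destinations)
    return [(s, d) for s in sources for d in dsts if s != d]


def top_k_links(
    candidates: Iterable[Edge],
    score_edge: Callable[[int, int, float], float],
    timestamp: float,
    k: int,
) -> List[Tuple[float, Edge]]:
    """Score each candidate at `timestamp` and keep the k highest-scoring edges."""
    scored = ((score_edge(s, d, timestamp), (s, d)) for s, d in candidates)
    return heapq.nlargest(k, scored)


def toy_score(src: int, dst: int, t: float) -> float:
    """Hypothetical placeholder for the model's edge score (assumption, not TGL's API)."""
    return 1.0 / (1 + abs(src - dst)) + 0.01 * dst


candidates = candidate_edges(sources=[0, 1], destinations=[0, 1, 2, 3])
top = top_k_links(candidates, toy_score, timestamp=0.0, k=2)
```

In a real script, `toy_score` would be replaced by a batched forward pass over the candidate edges, since scoring one pair at a time would be slow.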

Thank you.

tedzhouhk commented 2 years ago

Thank you for your interest in our work!

It's not hard to achieve this. Here are the two places you should look at:

Lastly, I want to point out that under the current link prediction setup, the top 100 destination nodes would be the same for every source node at a given time, because the edge probability is calculated by adding two "edge scores", one from the source node and one from the destination node. Another setup for GNN edge prediction is to directly learn an embedding for each node pair, which might better serve your needs.
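A small sketch of why this happens, under the simplifying assumption that the score is purely the sum of a source term and a destination term. The function name and the numbers below are made up for illustration and are not taken from the TGL code:

```python
from typing import Dict, List


def rank_destinations(src_score: float, dst_scores: Dict[int, float], k: int) -> List[int]:
    """Rank destinations by the additive score src_score + dst_scores[d]."""
    return sorted(dst_scores, key=lambda d: src_score + dst_scores[d], reverse=True)[:k]


# Hypothetical per-node "edge scores" at one fixed time.
dst_scores = {10: 0.3, 11: 0.9, 12: -0.2, 13: 0.5}
src_scores = {0: 1.5, 1: -4.0}

# The source term shifts every candidate by the same constant,
# so the ordering of destinations does not depend on the source.
rankings = {s: rank_destinations(a, dst_scores, k=3) for s, a in src_scores.items()}
```

Both sources end up with the same ranked list of destinations. Applying a monotonic function to the sum would not change this, which is why a pairwise embedding is the alternative to consider when per-source rankings matter.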

moudheus commented 2 years ago

Thank you, I will look into this!

moudheus commented 2 years ago

I implemented a predict script in the following way:

Could you please have a look at my implementation and tell me if anything seems wrong?

You can find it here: https://github.com/moudheus/tgl/blob/main/predict.py

I would be willing to do a cleaner version and send a pull request if I am on the right track.

Thanks!

tedzhouhk commented 2 years ago

Thanks for your contribution! Unfortunately, I've been a little busy recently and will probably look into it next month.

tedzhouhk commented 2 years ago

I have also added a script (#4) that allows any number of negative samples to be used during inference.

moudheus commented 2 years ago

Thanks for the message, will look into it.