pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

Add NDCG@K for link prediction performance evaluation #8271

Closed songsong0425 closed 8 months ago

songsong0425 commented 1 year ago

Dear PyG community,

Hi, thank you as always for your effort in maintaining and developing the package. When I tried to implement link prediction metrics (i.e., hits@k, ndcg@k), I couldn't find example code for ndcg@k. Could you consider adding an example of ndcg@k to the evaluation metrics? If it already exists and I missed it, can anyone tell me where I can find it?

Thank you for reading!

_Originally posted by @songsong0425 in https://github.com/pyg-team/pytorch_geometric/discussions/7792_

SimonPop commented 1 year ago

Hi there!

I would gladly work on that subject :)

I have a few questions about the implementation details.

Location of the implementation

I noticed that the hits@k metric was implemented as part of the test method of the KGE base class: torch_geometric.nn.kge.base.

What would be the best approach to implement ndcg@k?

  1. Add a method parameter (e.g. metric=ndcg / hits) to the test function to choose which to compute (hits@k or ndcg@k) and compute it inside this test method.
  2. Implement a separate method with a new name only for computing ndcg@k (e.g. test_ndcgk).
  3. Compute both in the method (but the signature will change, which is a breaking change Tuple[float, float] -> Tuple[float, float, float, float]).
  4. Only implement it in an example but not in the library itself.
  5. Somewhere else?

Details of implementation

In its general form, if I understand it correctly, NDCG allows gains from different possible tails for a single head. For example, for a link prediction task, any of the neighbors (tails) of a given node (head) can bring a gain.

Currently, in the hits@k, we allow a single tail per head.

Do we want to let the user give a list of different tails (fixed number or flexible)? (again, a possible change in the signature of the method)

def test(
        self,
        head_index: Tensor,
        rel_type: Tensor,
        tail_indexes: List[Tensor],
        batch_size: int,
        k: int = 10,
        log: bool = True,
    )

Or do we want to keep the one head - one tail limit?

def test(
        self,
        head_index: Tensor,
        rel_type: Tensor,
        tail_index: Tensor,
        batch_size: int,
        k: int = 10,
        log: bool = True,
    )
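
To make the multi-tail case concrete, here is a minimal self-contained sketch of NDCG@k with binary relevance (a candidate tail either is or is not a true neighbor of the head). The function name and signature are illustrative only, not an API proposal:

```python
import torch

def ndcg_at_k(pred_scores: torch.Tensor, relevance: torch.Tensor,
              k: int = 10) -> float:
    # pred_scores: [num_candidates] predicted score per candidate tail.
    # relevance:   [num_candidates] binary, 1 if the candidate is a true tail.
    k = min(k, pred_scores.size(0))
    # Rank discount: 1 / log2(rank + 1) for ranks 1..k.
    discount = 1.0 / torch.log2(torch.arange(2, k + 2, dtype=torch.float))
    # DCG: discounted gains of the k highest-scored candidates.
    top_k_rel = relevance[pred_scores.topk(k).indices].float()
    dcg = (top_k_rel * discount).sum()
    # IDCG: best achievable DCG, i.e. all true tails ranked first.
    ideal_rel = relevance.sort(descending=True).values[:k].float()
    idcg = (ideal_rel * discount).sum()
    return (dcg / idcg).item() if idcg > 0 else 0.0
```

With a single true tail per head this reduces to a reciprocal-log-rank score, while with several true tails each one contributes a discounted gain, which is the difference between the two signatures above.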

Thanks! :)

rusty1s commented 1 year ago

I would prefer we implement these as metrics that follow the design of torchmetrics, such that we can compute them in a mini-batch fashion during training. This would then be something like

def update(top_k_pred_mat, edge_label_index):
    pass
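
For illustration, the torchmetrics-style design accumulates state across mini-batches in update() and finalizes it once in compute(); a generic sketch of that pattern (class and variable names are hypothetical, shown here with a simple running mean rather than NDCG):

```python
import torch

class MeanMetric:
    # Minimal torchmetrics-style pattern: state is accumulated across
    # mini-batches in update() and finalized once in compute().
    def __init__(self) -> None:
        self.total = 0.0
        self.count = 0

    def update(self, values: torch.Tensor) -> None:
        self.total += values.sum().item()
        self.count += values.numel()

    def compute(self) -> float:
        return self.total / max(self.count, 1)

metric = MeanMetric()
for batch in (torch.tensor([1.0, 3.0]), torch.tensor([5.0])):
    metric.update(batch)
print(metric.compute())  # 3.0
```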

SimonPop commented 12 months ago

Alright, I understand the plan for a torchmetric design, but I'm a bit lost with regard to the signature.

I understand that we would like a class that looks a bit like this:

class NdcgK:
    def __init__(self):
        self.dcg = 0
        self.idcg = 0

    def update(self, top_k_pred_mat: Tensor, edge_label_index: Tensor):
        pass

    def compute(self):
        return self.dcg / self.idcg

And what about top_k_pred_mat and edge_label_index? Is top_k_pred_mat the matrix of all possible node pairs where only the top-k predictions have been kept, and edge_label_index the valid node pairs / positive edges?

I am a bit confused by the choice of these inputs. Why not leave the responsibility of selecting the top-k values to the metric itself? Is it to be able to choose k dynamically?

And then, since this is mini-batched, don't we also want a batch_index to be able to reset the log factors? I think the torchmetrics implementation of NDCG does that. Actually, do we want to reuse that implementation and simply develop a link-prediction wrapper, if this is for example purposes only?

I may have misunderstood something along the way, sorry about that.

rusty1s commented 12 months ago

edge_label_index would be a [2, num_labels] matrix, so it basically comes paired with batch_index. Let me know if I am missing something.

k should definitely be an attribute during initialization of the metric, and so top_k_pred_mat would just require size(1) >= k.
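
Under those shape conventions, a rough sketch of what the metric body could look like (helper name hypothetical, binary relevance assumed): top_k_pred_mat holds, per head node, the indices of its top-ranked candidate tails, and edge_label_index the positive (head, tail) pairs:

```python
import torch

def ndcg_from_topk(top_k_pred_mat: torch.Tensor,
                   edge_label_index: torch.Tensor, k: int) -> float:
    # top_k_pred_mat:   [num_heads, >=k] ranked candidate tail indices.
    # edge_label_index: [2, num_labels] positive (head, tail) pairs.
    num_heads = top_k_pred_mat.size(0)
    discount = 1.0 / torch.log2(torch.arange(2, k + 2, dtype=torch.float))
    ndcgs = []
    for head in range(num_heads):
        true_tails = edge_label_index[1, edge_label_index[0] == head]
        if true_tails.numel() == 0:
            continue  # no ground-truth tails, head does not contribute
        # Binary relevance of the top-k predicted tails for this head.
        rel = torch.isin(top_k_pred_mat[head, :k], true_tails).float()
        dcg = (rel * discount).sum()
        # Ideal DCG: all true tails ranked first.
        n_rel = min(true_tails.numel(), k)
        idcg = discount[:n_rel].sum()
        ndcgs.append((dcg / idcg).item())
    return sum(ndcgs) / max(len(ndcgs), 1)
```

This is just a reference loop to pin down the shapes; an actual implementation would vectorize over heads and accumulate dcg/idcg as metric state across update() calls.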

SimonPop commented 12 months ago

Yes! I think you are right about the batch_index not being needed.

I have tried to create a first implementation in a new branch so that I can get a more concrete picture. We can also use it as a basis for discussing the next updates.