awslabs / dgl-ke

High performance, easy-to-use, and scalable package for learning large-scale knowledge graph embeddings.
https://dglke.dgl.ai/doc/
Apache License 2.0
1.28k stars, 196 forks

[Question] --has_edge_importance #223

Closed. ccvalley closed this issue 3 years ago

ccvalley commented 3 years ago

When using the --has_edge_importance argument with dglke_train, is an edge with a higher importance score treated as more important than an edge with a lower score, or is it the other way around?

Based on the get_total_loss function, the loss is multiplied by edge_weight, so it seems that a lower edge weight would give a lower (more favorable) loss. https://github.com/awslabs/dgl-ke/blob/b4e57016d5715429377d5aab79e88c451dc543f5/python/dglke/models/pytorch/loss.py#L69-L77

Thank you.

    def get_total_loss(self, pos_score, neg_score, edge_weight=None):
        log = {}
        if edge_weight is None:
            edge_weight = 1
        if self.pairwise:
            pos_score = pos_score.unsqueeze(-1)
            loss = th.mean(self.loss_criterion((pos_score - neg_score), 1) * edge_weight)
            log['loss'] = get_scalar(loss)
            return loss, log

classicsong commented 3 years ago

A higher edge_weight means the edge contributes more to the loss, so the model is pushed harder to fit that edge. Usually you can assign higher weights to low-frequency relations to handle the data imbalance problem.
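
To make the direction of the weighting concrete, here is a minimal pure-Python sketch of the idea. It is not the DGL-KE implementation. It assumes a hinge loss with margin 1 as the loss criterion, and the helper names (`hinge`, `weighted_pairwise_loss`, `inverse_frequency_weights`) are hypothetical. The inverse-frequency scheme is only one possible way to pick weights for rare relations.

```python
from collections import Counter


def hinge(score, margin=1.0):
    """Hinge loss for a target label of +1 (assumed criterion)."""
    return max(0.0, margin - score)


def weighted_pairwise_loss(pos_scores, neg_scores, edge_weights=None, margin=1.0):
    """Mean of hinge(pos - neg) * weight over all (positive, negative) pairs.

    pos_scores: one score per positive edge.
    neg_scores: a list of negative-sample scores for each positive edge.
    edge_weights: one weight per positive edge (defaults to 1 for all).
    """
    if edge_weights is None:
        edge_weights = [1.0] * len(pos_scores)
    terms = []
    for pos, negs, weight in zip(pos_scores, neg_scores, edge_weights):
        for neg in negs:
            terms.append(hinge(pos - neg, margin) * weight)
    return sum(terms) / len(terms)


def inverse_frequency_weights(triples):
    """Hypothetical weighting: rare relations get larger weights.

    Uses total / (num_relations * count(relation)), so the weights
    average to 1 across all triples.
    """
    counts = Counter(rel for _, rel, _ in triples)
    total, num_relations = len(triples), len(counts)
    return [total / (num_relations * counts[rel]) for _, rel, _ in triples]


# Two edges with identical scores: only the weight differs.
pos = [0.2, 0.2]
neg = [[0.5], [0.5]]
unweighted = weighted_pairwise_loss(pos, neg)            # about 1.3
upweighted = weighted_pairwise_loss(pos, neg, [2.0, 1.0])  # about 1.95

# "founded" appears once, "likes" three times, so "founded" is upweighted.
triples = [("a", "likes", "b"), ("a", "likes", "c"),
           ("b", "likes", "c"), ("a", "founded", "d")]
weights = inverse_frequency_weights(triples)  # about [0.67, 0.67, 0.67, 2.0]
```

Doubling the weight of the first edge doubles its share of the total loss, so its error costs more and the optimizer has more incentive to fix it. A smaller weight does lower the loss value, but it does so by making the model care less about that edge, not by making the edge "better".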