DaehanKim / vgae_pytorch

This repository implements the variational graph auto-encoder (VGAE) by Thomas Kipf.

About normalization constants norm and weight_tensor #3

Closed: psanch21 closed this issue 3 years ago

psanch21 commented 3 years ago

Hi,

I am having some trouble understanding why we are using the normalization constants norm and weight_tensor and why they are defined this way. Could you provide some intuition behind this normalization?

norm = adj.shape[0] * adj.shape[0] / float((adj.shape[0] * adj.shape[0] - adj.sum()) * 2)
pos_weight = float(adj.shape[0] * adj.shape[0] - adj.sum()) / adj.sum()
weight_mask = adj_label.to_dense().view(-1) == 1
weight_tensor = torch.ones(weight_mask.size(0)) 
weight_tensor[weight_mask] = pos_weight

Thanks a lot in advance for your help!

DaehanKim commented 3 years ago

Hi @psanch21! Thank you for the good question.

pos_weight is intended to balance the effective number of positive edges against negative edges, since negative edges usually far outnumber positive ones. For example, suppose a 4 by 4 adjacency matrix (with no self-loops) contains 4 positive edges; the remaining 12 entries are negative. We want positive and negative edges to be equally effective for the model, so we multiply the loss of each positive sample by (# neg edges) / (# pos edges), which makes the effective number of samples equal to (# neg edges) for both classes. (Now a negative edge has loss weight 1 and a positive edge has loss weight 3 (= 12/4), so the effective counts of both kinds of edges are the same.)

However, the log likelihood then needs a correction, because we effectively have a loss over 2 * (# neg edges) samples. We therefore normalize the cross entropy by multiplying it by (# total entries) / [2 * (# neg edges)]. (The weighting makes the loss behave as if it were taken over 24 samples rather than the 16 actual entries, so we rescale the mean-reduced loss by 16/24 to get a proper average per effective sample.)
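
To make the arithmetic concrete, here is a minimal numeric sketch of that 4 by 4 example. The adjacency matrix and the uniform 0.5 predictions are made up purely for illustration, and the last line only mirrors the usual "norm times weighted binary cross entropy" pattern rather than this repository's exact training code:

import torch
import torch.nn.functional as F

# Toy 4x4 adjacency with 4 positive entries (symmetric edges 0-1 and 2-3)
# and no self-loops, so 12 of the 16 entries are negative.
adj = torch.tensor([[0., 1., 0., 0.],
                    [1., 0., 0., 0.],
                    [0., 0., 0., 1.],
                    [0., 0., 1., 0.]])

n_total = adj.shape[0] * adj.shape[0]   # 16 entries
n_pos = adj.sum()                       # 4 positive edges
n_neg = n_total - n_pos                 # 12 negative entries

pos_weight = n_neg / n_pos              # 12 / 4 = 3
norm = n_total / (2 * n_neg)            # 16 / 24 = 2/3

# Per-element loss weights: 1 for negatives, pos_weight for positives.
weight_tensor = torch.ones(n_total)
weight_tensor[adj.view(-1) == 1] = pos_weight

# Hypothetical predicted edge probabilities, just for illustration.
a_pred = torch.full((n_total,), 0.5)

# The weighted binary cross entropy averages over the 16 entries but
# effectively counts 24 samples; multiplying by norm (16/24) turns it
# back into an average per effective sample.
recon_loss = norm * F.binary_cross_entropy(a_pred, adj.view(-1), weight=weight_tensor)
print(pos_weight.item(), norm.item(), recon_loss.item())  # 3.0, 0.666..., ~0.693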

Hope this helps!

psanch21 commented 3 years ago

Hi @DaehanKim,

It does help! Thanks a lot!