pixelhero98 / MGDPR


Bug in class Multi_GDNN #3

Closed kwan-0915 closed 6 months ago

kwan-0915 commented 6 months ago

Hello. On line 41 of Multi_GDNN.py, both h and u are returned by diffusion_layers, and if we dive into diffusion_layers, that is the MGD model.

We will see that h is the tensor produced after the Conv2D, and u is the tensor that stores the result of s @ x.

Hence, h here represents h_l, and it is this result that we need to pass into equation 4 to compute h'_l.

However, on line 44 of Multi_GDNN.py, you pass u to the retention_layers instead of h, which contradicts equation 4 in the paper.
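
To make the mismatch concrete, here is a minimal, self-contained sketch of how I read the diffusion step. The layer, shapes, and names below are my own stand-ins, not the repository's exact code:

```python
import torch
import torch.nn as nn

# Stand-in for the MGD diffusion step (shapes and layer are assumptions, not the repo's code)
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=1)
s = torch.randn(10, 10)   # stand-in for the learned diffusion matrix
x = torch.randn(10, 21)   # stand-in for the node features over time steps

u = s @ x                                        # u stores the result of s @ x
h = conv(u.unsqueeze(0).unsqueeze(0)).squeeze()  # h is the tensor after the Conv2D

# Per equation 4, it is h (not u) that should be fed to the retention layer.
```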

It seems there are many fatal errors in the code base, and I wonder whether the results reported in the paper are reliable and correct.

Please have a look at the entire code base; otherwise the results are not only irreproducible, but there is also a case to be made that this is cheating.

Thank you.

kwan-0915 commented 6 months ago

There is another bug in Multi_GDNN.py. On both line 48 and line 50, h_prime is incorrect, or at least contradicts equation 4.

After concatenating the parallel retention output with the hidden graph representation, there should be a linear layer that takes this result as input, and the output of that linear layer should be passed through an activation function. However, both steps seem to be missing on lines 48 and 50.
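
For reference, here is a minimal sketch of what I understand equation 4 to require. The dimensions and the choice of ReLU below are illustrative assumptions, not values taken from the paper:

```python
import torch
import torch.nn as nn

# Illustrative dimensions only (assumptions, not the paper's actual values)
n, d_ret, d_h, d_out = 10, 32, 16, 24

retention_out = torch.randn(n, d_ret)  # stand-in for the parallel retention output
h = torch.randn(n, d_h)                # stand-in for the hidden graph representation h_l

linear = nn.Linear(d_ret + d_h, d_out)                               # the linear layer that appears to be missing
h_prime = torch.relu(linear(torch.cat([retention_out, h], dim=-1)))  # followed by an activation (ReLU here)
```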

pixelhero98 commented 6 months ago

Hi,

Thank you for reaching out and for your understanding.

I have corrected all the errors that might have caused unsuccessful tests of this older version (e.g., it can easily be tested with toy data: X, A, C = torch.randn(num_relation, n, time_steps), torch.randn(num_relation, n, n), torch.ones(n), with num_relation = 5 and n = 10). It should be functional now.

Second, I am on annual leave, do not have access to my machines, and am not planning to do further work right now. As I am the only one doing code development in the group, I have too many projects stored locally and the naming is messy, so sometimes the synchronized files are not the expected ones. I have found it very difficult to keep the local git and the website in sync, so I will have to rearrange this code when I get sufficient spare time.

Last, I understand the need for prompt responses, but people have their own issues and affairs to deal with. Many authors won't reply, or reply once a year, even though there are already many open issues. I seldom check GitHub, typically only a few times a year. Personally, I think it might be more efficient to seek assistance from colleagues who are familiar with the work and available to offer immediate support and advice, instead of repeatedly pushing a stranger. I wish you a successful continuation of your academic endeavors.

pixelhero98 commented 6 months ago

Here is an example of usage:

```python
import torch
import torch.nn.functional as F
import torch.distributions
from Multi_GDNN import MGDPR

# Configure the device for running the model on GPU or CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Define model hyperparameters (these can be tuned)
n = 10  # toy usage
d_layers, num_nodes, time_steps, num_relation, gamma, diffusion_steps = 6, n, 21, 5, 2.5e-4, 7

diffusion_layers = [time_steps, 3 * time_steps, 4 * time_steps, 5 * time_steps, 5 * time_steps, 6 * time_steps, 5 * time_steps]

retention_layers = [num_relation * 3 * n, num_relation * 5 * n, num_relation * 4 * n, num_relation * 4 * n, num_relation * 5 * n, num_relation * 5 * n, num_relation * 5 * n, num_relation * 5 * n, num_relation * 5 * n, num_relation * 5 * n, num_relation * 5 * n, num_relation * 5 * n, num_relation * 6 * n, num_relation * 5 * n, num_relation * 5 * n, num_relation * 5 * n, num_relation * 5 * n, num_relation * 5 * n]

ret_linear_layers_1 = [time_steps * num_relation, time_steps * num_relation, time_steps * num_relation * 5, time_steps * num_relation, time_steps * num_relation * 6, time_steps * num_relation, time_steps * num_relation * 6, time_steps * num_relation, time_steps * num_relation * 6, time_steps * num_relation, time_steps * num_relation * 6, time_steps * num_relation]

ret_linear_layers_2 = [time_steps * num_relation * 5, time_steps * num_relation * 5, time_steps * num_relation * 6, time_steps * num_relation * 6, time_steps * num_relation * 6, time_steps * num_relation * 6, time_steps * num_relation * 6, time_steps * num_relation * 6, time_steps * num_relation * 6, time_steps * num_relation * 6, time_steps * num_relation * 6, time_steps * num_relation * 6]

mlp_layers = [num_relation * 5 * time_steps + time_steps * num_relation, 128, 2]

# Toy data: node feature tensor, adjacency tensor, and label vector
X, A, C = torch.randn(num_relation, n, time_steps), torch.randn(num_relation, n, n), torch.ones(n)

# Define model
model = MGDPR(diffusion_layers, retention_layers, ret_linear_layers_1, ret_linear_layers_2, mlp_layers, d_layers, num_nodes, time_steps, num_relation, gamma, diffusion_steps)

# Pass model to GPU (or keep it on CPU)
model = model.to(device)

# Define optimizer and objective function
def theta_regularizer(theta):
    row_sums = torch.sum(theta, dim=-1)
    ones = torch.ones_like(row_sums)
    return torch.sum(torch.abs(row_sums - ones))

def D_gamma_regularizer(D_gamma):  # defined for completeness; not used in this toy loop
    upper_tri = torch.triu(D_gamma, diagonal=1)
    return torch.sum(torch.abs(upper_tri))

optimizer = torch.optim.Adam(model.parameters(), lr=3e-3)

# Define training process (validation and testing follow the same pattern)
epochs = 10000
model.reset_parameters()

# Training
for epoch in range(epochs):
    model.train()

    objective_total = 0
    acc = 0

    X = X.to(device)  # node feature tensor
    A = A.to(device)  # adjacency tensor
    C = C.long().to(device)  # label vector

    objective = F.cross_entropy(model(X, A), C)
    objective_total += objective

    # len([1]) == 1 here: averaging over a single toy batch
    objective_average = objective_total / len([1]) + 0.01 * theta_regularizer(model.theta)
    objective_average.backward()
    optimizer.step()
    optimizer.zero_grad()
```
kwan-0915 commented 6 months ago

Hi, thank you for the quick reply. I agree that not every author replies on GitHub (most just ignore issue tickets), and you have been extremely responsive and have provided a lot of help. I am very grateful.

I am following your project, and in fact I think I will be able to come up with a runnable example.

I will raise a pull request later once I have my code ready.

Once again, you are a very good author, and whether or not I can reproduce the results, I have learnt a lot from your paper.

Thank you; I will close out all the issues I opened.

kwan-0915 commented 6 months ago

This issue has been resolved; the author replied with a clear explanation.