mtang724 / NWR-GAE


Issues with model.py? #4

Open m-schier opened 1 year ago

m-schier commented 1 year ago

Hello,

I have some questions regarding your model implementation from src/model.py. Starting off with the forward encoder:

    def forward_encoder(self, g, h):
        # K-layer Encoder
        # Apply graph convolution and activation, pair-norm to avoid trivial solution
        h0 = h
        l1 = self.graphconv1(g, h0)
        l1_norm = torch.relu(self.norm(l1))
        l2 = self.graphconv2(g, l1_norm)
        l2_norm = torch.relu(self.norm(l2))
        l3 = self.graphconv3(g, l2)
        l3_norm = torch.relu(l3)
        l4 = self.graphconv4(g, l1_norm) # 5 layers
        return l4, l3_norm, l2_norm, l1_norm, h0

This code appears to implement the forward encoding with a fixed k = 4. However, I do not understand why l4 is based on l1_norm rather than l3_norm. Why does l3_norm only apply ReLU and not self.norm? And why is l3 calculated from l2 rather than l2_norm?
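For reference, this is what I would have expected a consistent k-layer encoder to look like, where each layer consumes the previous layer's normalized output. This is only a minimal numpy sketch of that pattern (a plain `a_hat @ h @ w` propagation and a simplified PairNorm stand in for DGL's `GraphConv` and your `self.norm`), not a claim about the intended implementation:

```python
import numpy as np

def pairnorm(h, eps=1e-6):
    # Simplified PairNorm: center rows, rescale to unit average row norm
    h = h - h.mean(axis=0, keepdims=True)
    scale = np.sqrt((h ** 2).sum(axis=1).mean()) + eps
    return h / scale

def gcn_layer(a_hat, h, w):
    # One graph-convolution step: propagate over a_hat, then transform by w
    return a_hat @ h @ w

def forward_encoder(a_hat, h0, weights):
    # Each layer is fed the previous layer's ReLU(PairNorm(...)) output
    outs = [h0]
    h = h0
    for w in weights:
        h = np.maximum(pairnorm(gcn_layer(a_hat, h, w)), 0.0)
        outs.append(h)
    # Deepest representation first, raw input last, mirroring
    # the (l4, l3_norm, l2_norm, l1_norm, h0) return order
    return outs[::-1]
```

Here every `l_{i+1}` depends on `l_i_norm`, which is the chaining I expected from the "K-layer Encoder" comment.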

Next, the function returns the tuple l4, l3_norm, l2_norm, l1_norm, h0, but it is assigned in forward() as:

gij, l2, l3, l1, h0 = self.forward_encoder(g, h)

Are l3 and l2 accidentally swapped?

Finally, self.layer1_generator is initialized but appears to be never used. Isn't the model lacking the neighborhood decoder for the first convolution then?

Thank you for considering my questions, Kind regards, Maximilian Schier

mtang724 commented 1 year ago

Hi Maximilian,

Yes, some mistakes were introduced when I cleaned the code for the GitHub release. Thank you so much for pointing them out! (For example, l3 and l2 are indeed accidentally swapped: we had swapped them in the encoder in the pre-release version of the code.)

Basing l4 on l1_norm is one of our experimental tricks: we directly decode the layer-4 representation back to layer 1, which gives faster training and convergence. This has proven effective and makes theoretical sense in our case, since we are not reconstructing the neighborhoods layer by layer but decoding the last layer's node encoding to its neighborhoods at different layers (as shown on p. 5 of our paper). For the same reason, self.layer1_generator is not necessary, which is also why the current code still achieves state-of-the-art results: which layers we decode to (e.g., 4->1 or 4->3->2->1) does not make a large difference under our main idea and in our experiments; the layer-by-layer variant simply requires longer training and more epochs. To be consistent with the paper's main procedure, I cleaned the code for the final release, and I will fix these issues accordingly very soon.
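If I follow correctly, the idea is that the final-layer embedding is decoded directly to the neighborhood signals at each hop, rather than unrolling the decoder layer by layer. A minimal numpy sketch of that shortcut (hypothetical function names; a simple `a_hat @ z` aggregation stands in for the paper's learned neighborhood decoder):

```python
import numpy as np

def decode_to_hops(a_hat, z, num_hops):
    # Decode the final-layer embedding z directly to each hop's
    # aggregated neighborhood signal: the 4 -> 1 shortcut rather
    # than the layer-by-layer 4 -> 3 -> 2 -> 1 chain.
    recons = []
    target = z
    for _ in range(num_hops):
        target = a_hat @ target  # spread one hop further out
        recons.append(target)
    return recons
```

Under this reading, dropping the per-layer generator (such as self.layer1_generator) is consistent, since all hop reconstructions are produced from the single last-layer encoding.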

Thank you so much, and I hope this answers some of your questions. Let me know if you have any further ones.

Best, Ming