Open Nick97Ohm opened 1 year ago
Hi! Actually sigmoid is a smooth continuous function, so it is expected that you get floating point numbers.
At train time you can use that floating point value as the "mean" parameter of a Bernoulli distribution and maximize the log likelihood (which is equivalent to what you are doing when minimizing the BinaryCrossentropy).
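To make that equivalence concrete, here is a minimal sketch in plain Python (no TensorFlow needed). The logits and labels are made-up values for illustration only and do not come from the karate club model. It shows that the mean binary cross-entropy is the negative of the mean Bernoulli log likelihood, so minimizing one maximizes the other.

```python
import math


def sigmoid(x):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bernoulli_log_likelihood(p, y):
    """Log-probability of label y (0 or 1) under Bernoulli(mean=p)."""
    return math.log(p) if y == 1 else math.log(1.0 - p)


def binary_cross_entropy(p, y):
    """Standard per-example binary cross-entropy."""
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


# Hypothetical node outputs (before the sigmoid) and their labels.
logits = [-2.0, 0.3, 1.5]
labels = [0, 1, 1]

probs = [sigmoid(z) for z in logits]
bce = sum(binary_cross_entropy(p, y) for p, y in zip(probs, labels)) / len(labels)
mean_ll = sum(bernoulli_log_likelihood(p, y) for p, y in zip(probs, labels)) / len(labels)

print(f"mean BCE:            {bce:.6f}")
print(f"mean log likelihood: {mean_ll:.6f}")
```

The two printed numbers have the same magnitude and opposite sign.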
At evaluation time, you can either sample from the Bernoulli distribution, or you can use the greedy approach of just rounding to 0 or 1, e.g. sigmoid(output) > 0.5 will return True or False (and then you can cast to an integer to get 0 or 1).
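Both evaluation options can be sketched in a few lines of plain Python. The logits below are hypothetical, and the helper names `greedy_labels` and `sampled_labels` are only illustrative, not part of any library.

```python
import math
import random


def sigmoid(x):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def greedy_labels(probs, threshold=0.5):
    """Round each probability to 0 or 1 by comparing against the threshold."""
    return [int(p > threshold) for p in probs]


def sampled_labels(probs, rng):
    """Draw each label from Bernoulli(p): 1 with probability p, else 0."""
    return [int(rng.random() < p) for p in probs]


# Hypothetical per-node outputs before the sigmoid.
logits = [-3.0, -0.2, 0.1, 2.5]
probs = [sigmoid(z) for z in logits]

greedy = greedy_labels(probs)
sampled = sampled_labels(probs, random.Random(0))  # seeded for repeatability

print("probabilities:", [round(p, 3) for p in probs])
print("greedy labels:", greedy)
print("sampled labels:", sampled)
```

The greedy version is deterministic, while the sampled version can flip labels for nodes whose probability is close to 0.5.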
Thanks a lot, then it was indeed a misunderstanding on my side. I thought sigmoid would act like a step function.
This now raises another question:
I changed the value of the "unlabeled" nodes to 0.5, because with -1 the nodes tended to be classified with label 0. I thought that a value of 0.5 could be interpreted as "not sure if loyal to Node 0 or Node 33", but the results still stay close to 0.5, where sometimes nodes close to Node 0 are above 0.5 and nodes close to Node 33 are below.
Shouldn't at least the direct neighbors become loyal to the labeled nodes right after the first message passing layer?
My guess is that their results are affected because their other neighbors are also set to 0.5. But how can I fix that, if that's the case?
I really appreciate your help!
I am currently trying to learn how Graph Neural Networks work, but I have been stuck for days in my understanding of this topic. Maybe one of you can help me out.
I am using Zachary's Karate Club as the graph dataset, where the goal is to perform node classification to determine which node (person) is loyal to which instructor (Node 0 or Node 33).
For this purpose I am using the InteractionNetwork module with Linear modules for the node and edge updates. I assumed (and maybe this is where I misunderstood something about GNNs) that if I put a sigmoid activation function after the node update, the nodes would have either 0 (loyal to Node 0) or 1 (loyal to Node 33) as values. But I get various floating point values instead.
Below is the code that I am using:
This is the output that I get:
Loss function:
Epoch 0 | Loss: 0.6619
Epoch 1 | Loss: 0.6547
Epoch 2 | Loss: 0.6478
Epoch 3 | Loss: 0.6412
Epoch 4 | Loss: 0.6351
Epoch 5 | Loss: 0.6292
Epoch 6 | Loss: 0.6233
Epoch 7 | Loss: 0.6172
Epoch 8 | Loss: 0.6110
Epoch 9 | Loss: 0.6048
Epoch 10 | Loss: 0.5988
Epoch 11 | Loss: 0.5931
Epoch 12 | Loss: 0.5877
Epoch 13 | Loss: 0.5826
Epoch 14 | Loss: 0.5777
Epoch 15 | Loss: 0.5728
Epoch 16 | Loss: 0.5680
Epoch 17 | Loss: 0.5633
Epoch 18 | Loss: 0.5589
Epoch 19 | Loss: 0.5549
Nodes:
I did not include the edges, because I don't think they are relevant to this issue and it would be too much information.