Suggestion: Attribute Prediction Example

RandomString123 commented 4 years ago

I am new to graph networks and I have reviewed, and am using, this project as a springboard to learn more. I understand the examples provided with this project and thought I could extend those examples to perform what most graph network papers describe as a basic task, predicting {Node, Edge, Global} attributes of a graph based on graph structures and relationships learned from training graphs.

However, every model I seem to apply does not produce the expected target graph it produces some intermediate graph that I can not figure out how to decode back into my original graph with the missing attributes filled in. I think it would help if there was an example project that performs {Node, Edge, Global} attribute prediction of a graph based training performed on similar graphs? Even if this example just trains on random graphs / data. I am more interested in structuring a network to:

With a GraphTuple or graph dictionary properly ablate out values
Train a GraphNetwork on un-ablated graphs (or sample ablated with the associated un-ablated target graphs)
Use the GraphNetwork on unseen ablated graphs to compute accuracy of the model.

alvarosg commented 4 years ago

I am not sure what the problem is exactly, could you provide a minimum working example?

In principle graph networks always take graphs as inputs and output full graphs with the same structure, and a configurable number of features per-edge, per-node, and per-global. So in principle all you need is to match the output sizes to have the same size as the inputs, and build a loss to minimize the difference between the features of the output graphs (produced by feeding the ablated input graphs to the graph net) and the un-ablated training graphs.

RandomString123 commented 4 years ago

In the next comment is a file that is a toy example I created. Imagine you wanted to represent all of the possible functions x + y = z as a a graph where x = {1,2,3}, y = {4,5,6}. Then over-fit a model so that if it sees the graph 1 + 5 = ? it will return the completed graph 1 + 5 = 6.

From your above message and my thoughts over the day I think there are three problems here and they all probably stem from my unfamiliarity with tf and graph networks.

1) I have too many free variables and they are unbounded so the algorithm just randomly bounces around. 2) I am probably using the wrong model. 3) My loss function is probably sub-optimal.

RandomString123 commented 4 years ago

I did some more testing over the weekend and found a bug in my loss model. Put a better version below. It still does much worse than I would expect for the reasons above. The linear model still doesn't do very well I think i should be using another when ablating out a nodes attributes and trying to predict its value based on the rest of the graph.

test.txt

alvarosg commented 4 years ago

Hi, there seems to be several things that could be fixed in that code (I tried fixing them an the model trains in 100 iterations):

For an operation x + y = z, currently you are creating a graph: (x) --> (y) --> (z). Since you are ablating the z value, and this needs to be conditioned on the x and y value, you need at least two steps of message passing (for example, two graph networks connected in series), for information to reach from x to z. Alternatively, you could connect your graph as (x) --> (z) <-- (y) (e.g. "receivers": [2, 2], in your code). Be ware, that in any case, if you try to ablate x or y, this will still not work, because the information is only sent in the direction of the edges, so if you want to ablate x or z, you will also need to add backward edges (and probably add a feature to distinguish them from the forward edges).
Currently ablating means setting a value of 0 in the node, in this case this works because the correct result is never 0, but beware that this may stop working if 0 can be a correct result.
edge_model_fn, node_model_fn and global_model_fn are snt.Linear, so the model will have very little capacity (Although in this case it is enough to solve the problem if you connect the graph as I indicated above). You should probably use snt.nets.MLP instead
The edge features are [1] and [2] respectively. You don't seem to be ablating those, but you are still adding an edge component to the loss. This should not be a problem, except because the edges are used to send information between the nodes, but because you are also regressing the edges, you are forcing the edge_model_fn, to have an output size of 1, this means the model only has a single edge feature, that has to output to either 1 or 2, to output the correct edge values, so it cannot really use the edges to send information between nodes. In this case the easiest solution would be to remove the edges from the loss, and increase OUTPUT_EDGE_SIZE. If you actually want to be able to make edge predictions of size one too, the usual alternative is to still have a larger OUTPUT_EDGE_SIZE to produce an output embedding for each edge, and then use an additional linear later only on output_graph.edges with an output size of 1. In practice the best option is to use a final modules.GraphIndependent to decode nodes, edges and globals independently.

Hope this helps!

RandomString123 commented 4 years ago

Those suggestions got me a lot closer to a working solution. Thanks, it is a good starting point for toying around with the networks and seeing how they operate.

google-deepmind / graph_nets

Suggestion: Attribute Prediction Example #90