google-deepmind / graph_nets

Build Graph Nets in Tensorflow
https://arxiv.org/abs/1806.01261
Apache License 2.0
5.34k stars 783 forks

Any function to perform weighted average as a reducer? #91

Closed JoaoLages closed 4 years ago

JoaoLages commented 4 years ago

By default, the network uses unsorted_segment_sum as the reducer in node_block_opt and global_block_opt. Is there an implementation of a reducer with learned weights that performs a weighted average (or any other learned aggregation)? I've tried to implement one, but replacing unsorted_segment_sum is proving too hard.

alvarosg commented 4 years ago

Thanks for your message.

There exists tf.unsorted_segment_mean, which you can use to perform a standard average. If you want a weighted average, you could write it in terms of sums like:

def unsorted_segment_weighted_average(data, weights, segment_ids, num_segments):
  epsilon = 1e-6
  weighted_sum = tf.unsorted_segment_sum(data * weights, segment_ids, num_segments)
  sum_weights = tf.unsorted_segment_sum(weights, segment_ids, num_segments)
  # epsilon avoids 0/0 for segments that receive no entries.
  weighted_average = weighted_sum / (sum_weights + epsilon)
  return weighted_average

(The epsilon is meant to avoid 0/0 divisions in case nodes don't have any received edges, or graphs don't have any nodes; you could also use tf.where instead.)
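For clarity, here is a small NumPy sketch of the same reducer that uses the tf.where-style masking instead of an epsilon; np_unsorted_segment_sum is a hypothetical stand-in for tf.unsorted_segment_sum, shown for 1-D data:

```python
import numpy as np

def np_unsorted_segment_sum(data, segment_ids, num_segments):
    # Hypothetical NumPy stand-in for tf.unsorted_segment_sum.
    out = np.zeros((num_segments,) + data.shape[1:])
    np.add.at(out, segment_ids, data)
    return out

def np_unsorted_segment_weighted_average(data, weights, segment_ids, num_segments):
    weighted_sum = np_unsorted_segment_sum(data * weights, segment_ids, num_segments)
    sum_weights = np_unsorted_segment_sum(weights, segment_ids, num_segments)
    # Mask empty segments (weight sum of zero) instead of adding an epsilon.
    safe = np.maximum(sum_weights, 1e-12)  # avoid a divide-by-zero warning
    return np.where(sum_weights == 0, 0.0, weighted_sum / safe)
```

Empty segments come out as exactly zero rather than a near-zero value divided by epsilon.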

Finally, if you want to be able to use that reducer, you probably need to define a closure so that the signature of the reducer is what the functions expect:

weights = ...  # wherever you get the weights from
def reducer(data, segment_ids, num_segments):
  return unsorted_segment_weighted_average(data, weights, segment_ids, num_segments)
modules.GraphNetwork.__init__(..., reducer=reducer)

I am not sure how you would learn a variable with the weights for the weighted average, though. unsorted_segment_sum/unsorted_segment_mean are performed along the node/edge axis, and since the number and order of nodes/edges may change, the size of the variable holding the weights would have to change, and the network would stop being permutation equivariant.
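The permutation point can be checked directly in NumPy: reordering the nodes (and their segment ids consistently) leaves a segment-sum aggregation unchanged, which is exactly the property a fixed-size per-node weight variable would break. np_unsorted_segment_sum here is a hypothetical stand-in for tf.unsorted_segment_sum:

```python
import numpy as np

def np_unsorted_segment_sum(data, segment_ids, num_segments):
    # Hypothetical NumPy stand-in for tf.unsorted_segment_sum.
    out = np.zeros((num_segments,) + data.shape[1:])
    np.add.at(out, segment_ids, data)
    return out

rng = np.random.default_rng(0)
data = rng.normal(size=(5, 3))       # 5 nodes, 3 features
segment_ids = np.array([0, 1, 0, 2, 1])

perm = rng.permutation(5)            # reorder the nodes
original = np_unsorted_segment_sum(data, segment_ids, 3)
permuted = np_unsorted_segment_sum(data[perm], segment_ids[perm], 3)
# The aggregated result is identical: the reducer does not depend on the
# order in which nodes appear within each segment.
assert np.allclose(original, permuted)
```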

Hope this helps!

JoaoLages commented 4 years ago

Thanks for your comment! My idea is to have a single dense layer whose input is a node embedding and whose output is a single scalar, with the same reasoning for edges.

This way, the dense layer learns to assign a weight to each embedding, and all weights are normalized via a softmax before performing the weighted average. With this setup, the reducer is also invariant to the number of nodes/edges in the graph when performing node/edge aggregation.

Do you see a way for us to add a Dense layer to the code you posted above?
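One way to sketch that idea outside the library, in NumPy: a hypothetical dense layer maps each embedding to a scalar logit, the logits are normalized with a softmax within each segment, and the normalized weights drive the weighted sum. The parameters W and b below stand in for a learned linear layer with output size 1; this is an illustration of the setup, not graph_nets code:

```python
import numpy as np

def segment_softmax(logits, segment_ids, num_segments):
    # Softmax over the entries of each segment (stabilized by the per-segment max).
    maxes = np.full(num_segments, -np.inf)
    np.maximum.at(maxes, segment_ids, logits)
    exp = np.exp(logits - maxes[segment_ids])
    sums = np.zeros(num_segments)
    np.add.at(sums, segment_ids, exp)
    return exp / sums[segment_ids]

def learned_weighted_average(embeddings, segment_ids, num_segments, W, b):
    # Hypothetical dense layer: embedding -> scalar attention logit.
    logits = (embeddings @ W + b).squeeze(-1)
    weights = segment_softmax(logits, segment_ids, num_segments)
    # Weighted sum of embeddings per segment; weights already sum to 1.
    out = np.zeros((num_segments, embeddings.shape[1]))
    np.add.at(out, segment_ids, embeddings * weights[:, None])
    return out
```

Because the weights come from a map shared across nodes plus a per-segment softmax, the reducer handles any number of nodes and stays permutation equivariant.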

alvarosg commented 4 years ago

That could be somewhat similar to how SelfAttention in modules.py works. There is even a non-exposed implementation of _unsorted_segment_softmax that may be useful.

Hope this helps!

JoaoLages commented 4 years ago

Thank you. I've looked at the SelfAttention module, but since it doesn't use globals, it is different from what I want to do. I really just wanted to replace the reducer function. I will look into the softmax function to see if it helps.

alvarosg commented 4 years ago

(Closing for now)