Visual-Behavior / detr-tensorflow

TensorFlow implementation of DETR: Object Detection with Transformers
MIT License
168 stars 53 forks

visualisation of transformer attention #38

Open BeusenBart opened 2 years ago

BeusenBart commented 2 years ago

Hi, when trying to access the transformer's internal outputs in an attempt to visualize attention, I am unable to reach these layers in the normal TensorFlow way, e.g. with `model.get_layer('transformer').get_layer('decoder').get_layer('layer_5').multihead_attn.output`. It seems the graph is not fully connected as it should be. I have found a couple of fixes that allowed me to access these layers successfully:
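For context, a minimal standalone sketch of how attention maps can be exposed when a layer is based on Keras's built-in `MultiHeadAttention` (an assumption for illustration; the layers in this repo may be implemented differently) is to request the scores directly instead of reading `.output` from the graph:

```python
import tensorflow as tf

# Illustrative sketch only: Keras's built-in MultiHeadAttention can return
# its attention map directly, avoiding any reliance on `layer.output`.
mha = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=32)

queries = tf.random.normal((1, 100, 256))  # e.g. DETR's 100 object queries
memory = tf.random.normal((1, 49, 256))    # e.g. a flattened 7x7 feature map

out, attn_scores = mha(queries, memory, return_attention_scores=True)
# attn_scores has shape (batch, num_heads, query_len, key_len),
# so each object query's attention over the feature map can be plotted.
print(attn_scores.shape)  # (1, 8, 100, 49)
```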

Maxlanglet commented 2 years ago

Hey, could you provide some example code? Following your advice, it does indeed build the graph correctly; the only thing is that when running this line: `model.get_layer('transformer').get_layer('decoder').get_layer('layer_5').multihead_attn.output`, the following error comes up: `AttributeError: Layer multi_head_attn has no inbound nodes.`

Looking into it, it seems this is related to the input shape not being specified. Since it should be defined in the DETR model, I am a bit lost.

Thanks in advance
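As a side note on the error itself: `no inbound nodes` typically appears because layers called inside a subclassed model's `call()` never get Keras graph nodes, so `.output` is undefined for them regardless of input shape. A hedged workaround sketch (hypothetical `ProbedAttention` wrapper, not code from this repo) is to stash the tensor yourself during the forward pass:

```python
import tensorflow as tf

# Sketch: layers used inside a subclassed model's call() get no Keras graph
# nodes, so `layer.output` raises "no inbound nodes". One workaround is to
# keep a reference to the attention map on each forward pass.
class ProbedAttention(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.mha = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=32)
        self.last_attn = None  # refreshed on every forward pass

    def call(self, q, kv):
        out, scores = self.mha(q, kv, return_attention_scores=True)
        self.last_attn = scores  # retained for later visualization
        return out

layer = ProbedAttention()
_ = layer(tf.random.normal((1, 100, 256)), tf.random.normal((1, 49, 256)))
print(layer.last_attn.shape)  # (1, 8, 100, 49)
```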