ematvey / hierarchical-attention-networks

Document classification with Hierarchical Attention Networks in TensorFlow. WARNING: this project is currently unmaintained; issues will probably not be addressed.
MIT License

Are uw and us global weights? Just to confirm. #18

Open · acadTags opened this issue 6 years ago

acadTags commented 6 years ago

Thank you, ematvey, for this implementation.

I wonder whether uw and us are two vectors shared as global weights, or whether there is a different uw for each sentence and a different us for each document?

From the code, I think these are global vectors; am I right? Please help me confirm this.
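For reference, in the original HAN paper (Yang et al., 2016) the word-level attention is

$$u_{it} = \tanh(W_w h_{it} + b_w), \qquad \alpha_{it} = \frac{\exp(u_{it}^\top u_w)}{\sum_t \exp(u_{it}^\top u_w)}, \qquad s_i = \sum_t \alpha_{it} h_{it},$$

where $u_w$ is a single word-level context vector with no sentence index, and $u_s$ plays the same role at the sentence level, so the paper itself seems to treat both as shared across the whole dataset.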

The docstring in model_components.py says:

Performs task-specific attention reduction, using learned attention context vector (constant within task of interest).

Both uw and us are defined in the function task_specific_attention(), and both are referred to as attention_context_vector. In the computational graph, are they actually different vectors? It would be helpful if you could explain this part a little.

attention_context_vector = tf.get_variable(
    name='attention_context_vector',
    shape=[output_size],
    initializer=initializer,
    dtype=tf.float32)
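For context, the rest of the function is roughly the sketch below (a paraphrase, not the exact source): every timestep is projected, scored against the single learned context vector, and the softmax-weighted sum of the projections is returned.

```python
import tensorflow as tf  # TF 1.x, matching the repo's API

def task_specific_attention(inputs, output_size,
                            initializer=tf.contrib.layers.xavier_initializer(),
                            activation_fn=tf.tanh):
    """Reduce `inputs` ([batch, time, units]) over the time axis using a
    single learned attention context vector."""
    # One trainable context vector; tf.get_variable ties it to the
    # enclosing variable scope, so each scope gets its own copy.
    attention_context_vector = tf.get_variable(
        name='attention_context_vector', shape=[output_size],
        initializer=initializer, dtype=tf.float32)
    # Project every timestep, score it against the context vector,
    # softmax the scores over the time axis, and return the weighted sum.
    input_projection = tf.contrib.layers.fully_connected(
        inputs, output_size, activation_fn=activation_fn)
    scores = tf.reduce_sum(input_projection * attention_context_vector,
                           axis=2, keepdims=True)
    attention_weights = tf.nn.softmax(scores, axis=1)
    return tf.reduce_sum(input_projection * attention_weights, axis=1)
```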

Thank you.

dugarsumit commented 6 years ago

I believe you are right. uw and us are global context vectors that store information about which words or sentences, respectively, are most informative. They are learned during the training process.
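Concretely, task_specific_attention() is applied once at the word level and once at the sentence level inside different variable scopes, so tf.get_variable() creates two separate trainable vectors. A minimal sketch of this (the scope names and shapes here are illustrative, not necessarily the repo's exact ones, and it reuses the task_specific_attention sketch quoted above):

```python
import tensorflow as tf  # TF 1.x

# Illustrative shapes: [batch*sentences, words, units] and [batch, sentences, units].
word_level_inputs = tf.placeholder(tf.float32, [None, 30, 100])
sentence_level_inputs = tf.placeholder(tf.float32, [None, 15, 100])

with tf.variable_scope('word'):
    # Creates word/attention_context_vector -- this plays the role of uw.
    sentence_vectors = task_specific_attention(word_level_inputs, 100)

with tf.variable_scope('sentence'):
    # Creates sentence/attention_context_vector -- this plays the role of us.
    document_vector = task_specific_attention(sentence_level_inputs, 100)

# Two distinct trainable variables, each shared ("global") across every
# sentence / every document in the dataset:
print([v.name for v in tf.trainable_variables()
       if 'attention_context_vector' in v.name])
# -> ['word/attention_context_vector:0', 'sentence/attention_context_vector:0']
```

Since neither vector depends on the batch, the same uw is applied to every sentence and the same us to every document; they are simply updated by gradient descent like any other weight.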