Choco31415 / Attention_Network_With_Keras

An example attention network with simple dataset.

What is the role of Dense(x,yt-1)? #3

Closed binaryOmaire closed 3 years ago

binaryOmaire commented 5 years ago

What are x, y, and t-1 in Dense(x, y(t-1))?

Choco31415 commented 5 years ago

The goal of an attention layer is to select which parts of the input matter most for generating the output. The input data itself determines its own importance, which is why x is fed into the context calculation.

Additionally, the network's context should vary over time. Inputting only x would make the context static. For that reason, this tutorial places an RNN layer on top of the attention mechanism, whose output is (somewhat confusingly) called y. y(t-1) is the previous RNN output, and feeding it into the context calculation gives the model a time-varying mechanism to shift its attention as it generates each output step.
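To make the idea concrete, here is a minimal NumPy sketch of one attention step. It is not the repo's actual code: the weight shapes (`W`, `v`), the tanh scoring function, and the function name `one_step_attention` are illustrative assumptions. It shows how x and y(t-1) are concatenated, scored by a small dense net, and turned into a context vector:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def one_step_attention(x, y_prev, W, v):
    """One attention step (illustrative sketch, not the repo's exact code).

    x      : (Tx, n_x)  encoder outputs, one row per input timestep
    y_prev : (n_y,)     previous RNN output y(t-1)
    W, v   : weights of a small scoring net (hypothetical shapes)
    """
    Tx = x.shape[0]
    # Broadcast y(t-1) onto every input timestep -> (Tx, n_x + n_y)
    y_rep = np.repeat(y_prev[None, :], Tx, axis=0)
    concat = np.concatenate([x, y_rep], axis=-1)
    # Score each timestep: e_t = v . tanh(W^T [x_t ; y(t-1)])
    energies = np.tanh(concat @ W) @ v        # (Tx,)
    alphas = softmax(energies)                # attention weights, sum to 1
    context = alphas @ x                      # weighted sum of inputs, (n_x,)
    return context, alphas
```

Because y(t-1) enters the scoring, the weights `alphas` (and hence the context) change at every output step, which is the time-varying behavior described above.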

Does this answer the question?