neulab / xnmt

eXtensible Neural Machine Translation
Other
185 stars 44 forks source link

Stative Attenders #566

Open armatthews opened 5 years ago

armatthews commented 5 years ago

Right now all the attenders are state-free, but there exist some attenders that require some concept of state. The most obvious one are things that involve coverage.

I propose that we add an AttenderState, have Attender.init_sent() return an initial state, and then have Attender.calc_attention() (and its cousin Attender.calc_context()) take in an AttenderState and return a new (i.e. updated) AttenderState.

Thoughts?

philip30 commented 5 years ago

I think this can be merged with Decoder state since only the decoder uses the attender.

On Sat, Feb 9, 2019 at 3:41 PM Austin Matthews notifications@github.com wrote:

Right now all the attenders are state-free, but there exist some attenders that require some concept of state. The most obvious one are things that involve coverage https://arxiv.org/pdf/1601.04811.pdf.

I propose that we add an AttenderState, have Attender.init_sent() return an initial state, and then have Attender.calc_attention() (and its cousin Attender.calc_context()) take in an AttenderState and return a new (i.e. updated) AttenderState.

Thoughts?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/neulab/xnmt/issues/566, or mute the thread https://github.com/notifications/unsubscribe-auth/AE_GbI9cRyYCr2XQShWeh3l_TNlk2hkDks5vLlGBgaJpZM4ayYKl .

-- Philip

armatthews commented 5 years ago

Hmm, that's a good point.

We still would need some sort of mechanism to have the Attender give some sort of initial state, and then some mechanism to get the next state after adding some input.

Another consideration is that right now the Attender is a direct descendant of the Model, and not contained within the Decoder. That makes it hard, for example, for decoder.init_sent() to call attender.init_sent(). Perhaps we should move the attender to within the decoder?

armatthews commented 5 years ago

Fixed in this PR.