Autoregressive decoders refactor

varisd commented 5 years ago

Refactor and code cleanup of the autoregressive.py, decoder.py and decoder/transformer.py

DecoderFeedables and DecoderHistories now contain attribute common for all derived decoders and an attribute other which is internal to a specific autoregressive implementation. When working with Autoregressive in general, the other should not be visible outside of the derived decoder.

There is also a clear distinction between DecoderFeedables and DecoderHistories. Only DecoderFeedables should be used when generating next output state. DecoderHistories should only record the decoding process history and should never be used to provide input for the next decoding step. DecoderFeedables attributes are batch-major. DecoderHistories attributes are time-major. This is required for compatibility with current BeamSearchDecoder.

Also, the decoding loop was restructured to contain less repeating code. The body method is no longer abstract and contains the code common for all derived decoders, which follows:

call next_state(loop_state) - method which is specific for each derived decoder and should only output pre-logit output-state, decoder-specific feedables and histories (for the .other attributes).
then the output of the next_state is transformed into logits and next symbols input symbols are produced (this step should be independent of the derived decoder implementation)
finally, new next feedables and histories are prepared (note that we feed the embedded input symbols instead of their vocabulary ids)

jindrahelcl commented 5 years ago

Když by to nebylo dohromady s tím batchováním, tak bych to mergoval. S tím batchováním je asi potřeba se domluvit s Jindrou, jak a kde to koliduje s jeho refaktorem.

Tohle mi teď přijde, že je asi dotáhnutější kus práce, takže bych to řešil dřív. Podívám se na to.

jlibovicky commented 5 years ago

Rebasováno na masteru. Detailněji to prodju zítra.

varisd commented 5 years ago

Rebasováno na masteru. Detailněji to prodju zítra.

Kdyztak ten rebase prosim pushni do nejake nove vetve (treba .rebased). A tu starou presun, ale at je nekde schovana pro potreby revertu

jlibovicky commented 5 years ago

presun

Pozdě, už jsem tu větev přepsal. Jestli máš někde ještě lokálně to původní, tak jí můžeš někam pushnout. Ale stejně na té původní větvi sedí ještě ten další pull request, ne?

varisd commented 5 years ago

Jeste jsem faktorizoval obsah get_initial_loop_state do podmetod.

varisd commented 5 years ago

Co tady chybi? Jen opravit konfiguraky, aby prosly testy?

jlibovicky commented 5 years ago

Pokud k tomu @jindrahelcl nic nemá, tak jo.

jindrahelcl commented 5 years ago

já mam dovolenou. chtěl jsem se na to ještě podívat, protože je to docela velkej pull request. Jestli na to nějak spěcháte, tak to zamergujte, po zběžným kouknutí večer v hotelu jsem si nevšimnul ničeho závažnýho.

jlibovicky commented 5 years ago

Ještě rybíz.

ufal / neuralmonkey

Autoregressive decoders refactor #796