WIP: SequenceGenerator2

rizar commented 8 years ago

This is the first version of the code that works; see an adapted Markov chain example here.

Also see https://github.com/mila-udem/blocks/issues/988

[ ] make sure that all inputs and outputs of the recurrent brick are handled correctly
[ ] docstrings
[ ] new AttentionRecurrent
[ ] try using delegate where possible
[ ] return the score in generate

rizar commented 8 years ago

Heads-up: the adapted reverse_words in my branch of Blocks-examples is already functional! In the process of making it work, I found that having a Fork in Feedback, even when it seems unnecessary, can significantly speed up training. Namely, I was using SimpleRecurrent, which has only one input, so forking was technically redundant. But the bias vector, shared across the embeddings, seems to be helpful.

To prevent confusion, I made Fork mandatory for now. Making it optional later will not be difficult.
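
To illustrate the point about the bias, here is a minimal NumPy sketch, not the actual Blocks code. It assumes that Feedback amounts to an embedding lookup and that Fork amounts to an affine projection of that embedding; the names `embeddings`, `W_fork`, `b_fork` and both helper functions are hypothetical and exist only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim, state_dim = 5, 3, 4

# Hypothetical parameters, named for illustration only.
embeddings = rng.normal(size=(vocab_size, embed_dim))  # lookup table
W_fork = rng.normal(size=(embed_dim, state_dim))       # fork projection
b_fork = rng.normal(size=(state_dim,))                 # one bias for all symbols


def feedback_without_fork(symbols):
    """Plain lookup: the embedding is fed to the recurrent input as-is."""
    return embeddings[symbols]


def feedback_with_fork(symbols):
    """Lookup followed by an affine map; b_fork is added for every symbol."""
    return embeddings[symbols] @ W_fork + b_fork


symbols = np.array([0, 3, 3, 1])
plain = feedback_without_fork(symbols)
forked = feedback_with_fork(symbols)
```

With a single recurrent input the projection could in principle be folded into the lookup table, which is why the fork looks redundant; the only structural difference in this sketch is the extra offset `b_fork` that every symbol shares.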

rizar commented 8 years ago

I have learnt that some people are already trying to use it. Given that this is Blocks-extras, I will merge this PR, hoping that we will finish it over the next few months and move it to Blocks.

mila-iqia / blocks-extras

WIP: SequenceGenerator2 #43