In the "Translation with a Sequence to Sequence Network and Attention" document,
in this paragraph:
"Teacher Forcing", or maximum likelihood sampling, means using the real target outputs as each next input when training. The alternative is using the decoder's own guess as the next input. Using teacher forcing may cause the network to converge faster, but when the trained network is exploited, it may exhibit instability.
the reference is broken. I believe it should point to this: ESNTutorialRev
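
For context, here is a minimal sketch of the distinction the quoted paragraph draws, in case it helps when checking that the fixed link lands on the right material. It is not the tutorial's code: `toy_decoder_step`, `run_decoder`, `SOS_TOKEN`, and `teacher_forcing_ratio` are hypothetical names used only for illustration, and the "decoder" is a deterministic stand-in rather than a trained network.

```python
import random

SOS_TOKEN = 0  # hypothetical start-of-sequence token id


def toy_decoder_step(prev_token):
    """Stand-in for a real decoder step: deterministically 'predicts' a next token."""
    return (prev_token + 1) % 10


def run_decoder(target_tokens, teacher_forcing_ratio=0.5, rng=random):
    """Decode one sequence and return (inputs fed to the decoder, its predictions).

    With teacher forcing, the real target token becomes the next input;
    without it, the decoder's own guess is fed back in.
    """
    # Decide once per sequence whether to use teacher forcing (an assumption
    # of this sketch; other schemes decide per step).
    use_teacher_forcing = rng.random() < teacher_forcing_ratio

    decoder_input = SOS_TOKEN
    fed_inputs, predictions = [], []
    for target in target_tokens:
        fed_inputs.append(decoder_input)
        prediction = toy_decoder_step(decoder_input)
        predictions.append(prediction)
        # The only difference between the two modes is this line.
        decoder_input = target if use_teacher_forcing else prediction
    return fed_inputs, predictions


if __name__ == "__main__":
    targets = [5, 7, 9]
    print(run_decoder(targets, teacher_forcing_ratio=1.0))  # always teacher-forced
    print(run_decoder(targets, teacher_forcing_ratio=0.0))  # always free-running
```

With a ratio of 1.0 the inputs after the first step are the target tokens themselves; with 0.0 they are the decoder's own earlier predictions, so any early mistake is carried forward.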