In session 1, nmt.py (https://github.com/kyunghyuncho/dl4mt-material/blob/master/session1/nmt.py), line 529, I'm wondering whether 'proj_h = proj[0]' should be replaced with 'proj_h = proj'.
As written, proj[0] only holds the decoder hidden state at time step 0 (note that gru_cond_simple_layer returns rval, not [rval]). If you print the shapes of logit_lstm and logit_prev, you will find that they are inconsistent.
Session 2, with attention, does not have this problem.
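
To illustrate what I mean, here is a minimal NumPy sketch (not the actual Theano code from nmt.py). The dimensions and the two helper functions are hypothetical stand-ins I made up for the example; the only point is how indexing with [0] behaves when a layer returns a bare tensor versus a one-element list.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_timesteps, n_samples, dim = 4, 2, 3

# Stand-in for the decoder hidden states over all time steps,
# shaped (n_timesteps, n_samples, dim).
rval = np.zeros((n_timesteps, n_samples, dim), dtype="float32")


def layer_returning_tensor():
    # Mimics a layer that returns rval directly.
    return rval


def layer_returning_list():
    # Mimics a layer that returns [rval].
    return [rval]


# Case 1: the layer returns a tensor, so [0] selects time step 0 only.
proj = layer_returning_tensor()
shape_indexed_tensor = proj[0].shape   # (2, 3)
shape_whole_tensor = proj.shape        # (4, 2, 3)

# Case 2: the layer returns a list, so [0] selects the whole sequence.
proj_list = layer_returning_list()
shape_indexed_list = proj_list[0].shape  # (4, 2, 3)

print(shape_indexed_tensor, shape_whole_tensor, shape_indexed_list)
```

If the layer returns a bare tensor, only 'proj_h = proj' keeps the time dimension, which is what I would expect logit_lstm to need in order to match logit_prev.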