nyu-dl / dl4mt-tutorial

BSD 3-Clause "New" or "Revised" License

On decoder to softmax layer projection in session 1 #43

Closed: ustctf-zz closed this issue 8 years ago

ustctf-zz commented 8 years ago

In session 1, at line 529 of nmt.py (https://github.com/kyunghyuncho/dl4mt-material/blob/master/session1/nmt.py), I'm wondering whether 'proj_h = proj[0]' should be replaced with 'proj_h = proj'.

As written, proj[0] keeps only the decoder hidden state at time step 0 (note that gru_cond_simple_layer returns rval, not [rval], so indexing with [0] selects the first time step rather than unwrapping a list). If you print the shapes of logit_lstm and logit_prev, you will find that they are inconsistent.
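To illustrate the indexing issue, here is a minimal pure-Python sketch. It does not use Theano or the actual nmt.py code: nested lists stand in for tensors, and the sizes, the `shape` helper, and the `wrong`/`right` names are hypothetical, chosen only for illustration. It assumes the layer returns a single sequence with a leading time axis, which is how I read "returning rval not [rval]".

```python
# Hypothetical sizes, for illustration only.
n_steps, n_samples, dim = 4, 2, 3


def shape(x):
    """Return the nested-list shape of x as a tuple (hypothetical helper)."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0]
    return tuple(dims)


# Stand-in for the layer output: one sequence with a leading time axis,
# laid out as (n_steps, n_samples, dim). Each entry holds its time index.
rval = [[[float(t)] * dim for _ in range(n_samples)] for t in range(n_steps)]

proj = rval       # the layer returns rval itself, not [rval]
wrong = proj[0]   # picks out time step 0 only
right = proj      # keeps every time step

print(shape(wrong))  # (2, 3)
print(shape(right))  # (4, 2, 3)
```

If the layer returned [rval] instead, proj[0] would unwrap the list and give back the full sequence, which may be why the index was there in the first place.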

Session 2 (with attention) does not have this problem.

scfrank commented 8 years ago

I can confirm this bug and the fix.

orhanf commented 8 years ago

Thank you @ustctf for pointing out the bug, and thank you @scfrank for the fix.