Is the GRU implementation consistent with the paper?

Maluuba / gensen

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

Is the GRU implementation consistent with the paper? #5

Open my-yy opened 5 years ago

my-yy commented 5 years ago

I read the peephole GRU implementation in models.py:

    newgate = F.tanh(i_n + resetgate * h_n + p_n)    (line 54)
    hy = newgate + inputgate * (hidden - newgate)    (line 55)

Are they consistent with equations (3) and (4) in the paper? I think line 54 is missing the entrywise product of r_t and h_{t-1}, and line 55 does not look like equation (4) either.
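To make the two concerns concrete, here is a small numerical sketch in plain Python. It rests on assumptions that the snippet above does not confirm: that `h_n` is a linear projection of the previous hidden state (written here with a hypothetical matrix `U`), that `inputgate` plays the role of the update gate z_t, and that equation (4) is the usual GRU interpolation between the candidate state and the previous hidden state. All values are random stand-ins, not taken from the model.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def hadamard(a, b):
    """Entrywise product of two equal-length vectors."""
    return [x * y for x, y in zip(a, b)]


def allclose(a, b):
    """True if two vectors agree entry by entry up to rounding."""
    return all(math.isclose(x, y, rel_tol=1e-9, abs_tol=1e-12) for x, y in zip(a, b))


random.seed(0)
dim = 4
h = [random.uniform(-1, 1) for _ in range(dim)]   # previous hidden state h_{t-1}
r = [random.random() for _ in range(dim)]         # reset gate r_t
z = [random.random() for _ in range(dim)]         # "inputgate", assumed to be z_t
n = [math.tanh(random.uniform(-2, 2)) for _ in range(dim)]  # candidate ("newgate")
# Hypothetical hidden-to-hidden weights; h_n is assumed to be U @ h.
U = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

# Line 55 as written: hy = newgate + inputgate * (hidden - newgate)
hy_code = [n_i + z_i * (h_i - n_i) for n_i, z_i, h_i in zip(n, z, h)]
# Interpolation form: (1 - z) * newgate + z * hidden
hy_interp = [(1 - z_i) * n_i + z_i * h_i for n_i, z_i, h_i in zip(n, z, h)]
update_matches = allclose(hy_code, hy_interp)

# Line 54 style: reset gate applied after the linear map, r * (U h)
reset_after = hadamard(r, matvec(U, h))
# Entrywise-product-first style: U (r * h)
reset_before = matvec(U, hadamard(r, h))
reset_matches = allclose(reset_after, reset_before)

print("line 55 equals the interpolation form:", update_matches)
print("r * (U h) equals U (r * h):", reset_matches)
```

Under these assumptions, line 55 is only an algebraic rearrangement of (1 - z) * newgate + z * hidden, so it would differ from an equation of the form (1 - z) * hidden + z * newgate only in which side the gate is attached to. Line 54 is a different matter: applying the reset gate after the linear map generally gives a different vector than taking the entrywise product of r_t and h_{t-1} first, so that part would be a real difference in formulation rather than a rewriting.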