CSTR-Edinburgh / merlin

This is now the official location of the Merlin project.
http://www.cstr.ed.ac.uk/projects/merlin/
Apache License 2.0

A bug in Merlin's RNN implementation. #383

Open lomizandtyd opened 6 years ago

lomizandtyd commented 6 years ago

Merlin uses the theano.scan function incorrectly in gating.py. I have confirmed the bug with my local code.

Theano's scan function loops over the first dimension of its input sequences, but the batch input has three dimensions ordered as (num_batches, num_timesteps, num_dimensions). As written, scan therefore iterates over batches rather than timesteps. The input should be transposed before scan so that it has shape (num_timesteps, num_batches, num_dimensions), and the scan output should be transposed back afterwards.
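To illustrate the transpose, scan, transpose-back pattern, here is a minimal sketch in plain NumPy rather than Theano. `scan_first_axis` is a hypothetical stand-in that only mimics the fact that theano.scan iterates over axis 0, and `step` is a toy running-sum recurrence, not Merlin's gating unit. In Theano itself the same reordering would presumably be done with `dimshuffle(1, 0, 2)` on the input and output tensors.

```python
import numpy as np


def scan_first_axis(step, sequences, h0):
    """Hypothetical stand-in for theano.scan: loops over axis 0 of `sequences`."""
    h = h0
    outputs = []
    for x_t in sequences:          # one slice per index of the FIRST axis
        h = step(x_t, h)
        outputs.append(h)
    return np.stack(outputs)       # stacked along a new leading axis


def step(x_t, h_prev):
    # Toy recurrence (running sum over time), used only for illustration.
    return h_prev + x_t


num_batches, num_timesteps, num_dimensions = 2, 3, 4
x = np.arange(num_batches * num_timesteps * num_dimensions, dtype=float)
x = x.reshape(num_batches, num_timesteps, num_dimensions)

# 1. Transpose to time-major so the loop runs over timesteps, not batches.
x_time_major = x.transpose(1, 0, 2)   # (num_timesteps, num_batches, num_dimensions)

# 2. Run the recurrence; the hidden state has one row per batch item.
h0 = np.zeros((num_batches, num_dimensions))
h_time_major = scan_first_axis(step, x_time_major, h0)

# 3. Transpose the result back to batch-major.
h = h_time_major.transpose(1, 0, 2)   # (num_batches, num_timesteps, num_dimensions)

# The recurrence now accumulates along the time axis of each batch item.
assert np.allclose(h, np.cumsum(x, axis=1))
```

Without step 1, the loop would walk over the batch axis and treat each batch item as if it were a timestep, which is the behaviour described above.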

Note: I have only checked gating.py.