awslabs / keras-apache-mxnet

[DEPRECATED] Amazon Deep Learning's Keras with Apache MXNet support
https://github.com/awslabs/keras-apache-mxnet/wiki

Masking Layer doesn't work after adding a NaiveRunGraph feature in MXNet #228

Open karan6181 opened 5 years ago

karan6181 commented 5 years ago

Thank you!

Below is a minimal reproducible example:

import numpy as np
from keras.layers import LSTM
from keras.layers import Embedding
from keras.models import Sequential

num_samples = 2
timesteps = 5
embedding_dim = 4
units = 3
embedding_num = 12

model = Sequential()
# mask_zero=True makes the Embedding layer emit a mask for zero-valued inputs
model.add(Embedding(embedding_num, embedding_dim,
                    mask_zero=True,
                    input_length=timesteps))

# The failure also reproduces with SimpleRNN (from keras.layers import SimpleRNN):
# layer = SimpleRNN(units)
layer = LSTM(units)  # default unroll=False
model.add(layer)
model.compile(optimizer='sgd', loss='mse')

# left-pad with zeros so the leading timesteps are masked
left_padded_input = np.ones((num_samples, timesteps))
left_padded_input[0, :1] = 0
left_padded_input[1, :2] = 0
out6 = model.predict(left_padded_input)  # fails on the MXNet backend
roywei commented 5 years ago

I think it triggers the naive run graph only when masking is enabled and the sym.foreach operator is used, which means an RNN layer with unroll=False does not work with a masking layer.
Current workaround to enable masking: use unroll=True in the RNN layer, as sketched below.
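
To make the difference concrete, the workaround is a single argument change (a sketch reusing the names from the reproduction above; the comments reflect my reading of the trigger condition):

from keras.layers import LSTM

units = 3
layer = LSTM(units)               # default unroll=False: MXNet uses sym.foreach, masking breaks
layer = LSTM(units, unroll=True)  # statically unrolled graph; masking works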

karan6181 commented 5 years ago

Yes, absolutely correct. If we add unroll=True to the RNN/LSTM/GRU layer, it uses the static forward pass and works without any issue.

Below is the working code with unroll=True added:

import numpy as np
from keras.layers import LSTM
from keras.layers import Embedding
from keras.models import Sequential

num_samples = 2
timesteps = 5
embedding_dim = 4
units = 3
embedding_num = 12

model = Sequential()
# mask_zero=True makes the Embedding layer emit a mask for zero-valued inputs
model.add(Embedding(embedding_num, embedding_dim,
                    mask_zero=True,
                    input_length=timesteps))

# layer = SimpleRNN(units, unroll=True) also works (from keras.layers import SimpleRNN)
layer = LSTM(units, unroll=True)  # static unrolled forward pass
model.add(layer)
model.compile(optimizer='sgd', loss='mse')

# left-pad with zeros so the leading timesteps are masked
left_padded_input = np.ones((num_samples, timesteps))
left_padded_input[0, :1] = 0
left_padded_input[1, :2] = 0
out6 = model.predict(left_padded_input)  # runs without error
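
For completeness, the same workaround should carry over to the other recurrent layers mentioned above, since GRU and SimpleRNN accept the same unroll argument. A minimal sketch with GRU swapped in (untested here; the assumption is that it behaves like the LSTM case):

import numpy as np
from keras.layers import GRU, Embedding
from keras.models import Sequential

model = Sequential()
model.add(Embedding(12, 4, mask_zero=True, input_length=5))
model.add(GRU(3, unroll=True))  # unroll=True avoids the sym.foreach path
model.compile(optimizer='sgd', loss='mse')

left_padded_input = np.ones((2, 5))
left_padded_input[0, :1] = 0
left_padded_input[1, :2] = 0
out = model.predict(left_padded_input)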