kalyc closed this PR 5 years ago
As per this issue - we will need to wait for the MXNet v1.3 release to be able to use the new mx.sym.embedding
API signature for sparse. Will update this PR when the new MXNet pip package is available.
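For context on what the sparse signature buys: with a sparse (row-sparse) gradient, an embedding update only touches the rows whose indices actually appear in the batch, rather than writing the whole table. A minimal pure-Python sketch of that idea - this is illustrative only, not the MXNet implementation, and the function name is made up:

```python
# Sketch: a sparse embedding update only modifies the rows referenced
# by the batch indices; all other rows of the table are left untouched.
def sparse_embedding_update(table, batch_indices, row_grads, lr=0.5):
    """Apply SGD updates only to the rows listed in batch_indices."""
    for idx, grad in zip(batch_indices, row_grads):
        table[idx] = [w - lr * g for w, g in zip(table[idx], grad)]
    return table

# A 5-row, 2-dim embedding table; only rows 1 and 3 appear in the batch.
table = [[1.0, 1.0] for _ in range(5)]
table = sparse_embedding_update(table, [1, 3], [[1.0, 1.0], [2.0, 2.0]])
print(table[0])  # untouched row: [1.0, 1.0]
print(table[1])  # updated row:   [0.5, 0.5]
```

For a vocabulary of 20000 rows (as in the model below) and a batch touching only a few hundred distinct words, skipping the untouched rows is where the speedup comes from.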
@kalyc - Can we move ahead with this, as we discussed, using the mxnet --preview package?
Updated the PR and tested end-to-end with the imdb_lstm model with sparse_grad set to True.
Removed the embedding unit test, as there is no data bound to the embedding symbol to test with.
Model -
from __future__ import print_function
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Embedding
from keras.layers import LSTM
from keras.datasets import imdb
from keras import backend as K
max_features = 20000
maxlen = 80 # cut texts after this number of words (among top max_features most common words)
batch_size = 32
print('Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')
print('Pad sequences (samples x time)')
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
print('Build model...')
model = Sequential()
print(K.backend())
# MXNet backend does not support dropout in LSTM and cannot automatically infer shape
if K.backend() == 'mxnet':
    # specify input_length and remove dropout params
    model.add(Embedding(max_features, 128, input_length=maxlen, sparse_grad=True))
    model.add(LSTM(128, unroll=True))
else:
    model.add(Embedding(max_features, 128))
    model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))
# try using different optimizers and different optimizer configs
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
print('Train...')
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=1,
          validation_data=(x_test, y_test))
score, acc = model.evaluate(x_test, y_test,
                            batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)
Result -
Using MXNet backend
Loading data...
25000 train sequences
25000 test sequences
Pad sequences (samples x time)
x_train shape: (25000, 80)
x_test shape: (25000, 80)
Build model...
mxnet
Train...
Train on 25000 samples, validate on 25000 samples
Epoch 1/1
/anaconda2/envs/mxnet/lib/python3.4/site-packages/mxnet/module/bucketing_module.py:408: UserWarning: Optimizer created manually outside Module but rescale_grad is not normalized to 1.0/batch_size/num_workers (1.0 vs. 0.03125). Is this intended?
force_init=force_init)
[14:57:19] src/operator/nn/../../common/utils.h:450: Optimizer with lazy_update = True detected. Be aware that lazy update with row_sparse gradient is different from standard update, and may lead to different empirical results. See https://mxnet.incubator.apache.org/api/python/optimization/optimization.html for more details.
25000/25000 [==============================] - 242s 10ms/step - loss: 0.4519 - acc: 0.7784 - val_loss: 0.3670 - val_acc: 0.8384
25000/25000 [==============================] - 60s 2ms/step
Test score: 0.36697145671844483
Test accuracy: 0.83836
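On the UserWarning in the log above: when an optimizer is created manually outside the Module, MXNet expects its rescale_grad to be normalized to 1.0/batch_size/num_workers. With batch_size=32 and (presumably, in this single-machine run) one worker, that target is the 0.03125 quoted in the warning, while the default rescale_grad is 1.0:

```python
# Reproduce the numbers from the warning message:
# "rescale_grad is not normalized to 1.0/batch_size/num_workers (1.0 vs. 0.03125)"
batch_size = 32   # from the training script above
num_workers = 1   # assumption: single-machine run, no distributed workers
expected_rescale_grad = 1.0 / batch_size / num_workers
print(expected_rescale_grad)  # 0.03125, the value quoted in the warning
```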
Summary
Add minimal test for sparse embedding operator support
Related Issues
Missing Sparse operator support
PR Overview