IndicoDataSolutions / Passage

A little library for text analysis with RNNs.
MIT License

LSTM example raises a `dtype` mismatch error #34

Open sjhddh opened 9 years ago

sjhddh commented 9 years ago

Hello, I just ran the example in Passage/mnist.py exactly as written. The only modification was to change GatedRecurrent to LstmRecurrent:

import ...
...

trX, teX, trY, teY = load_mnist()

#Use generic layer - RNN processes a size 28 vector at a time scanning from left to right
layers = [
    Generic(size=28),
    LstmRecurrent(size=512, p_drop=0.2),
    Dense(size=10, activation='softmax', p_drop=0.5)
]

#A bit of l2 helps with generalization, higher momentum helps convergence
updater = NAG(momentum=0.95, regularizer=Regularizer(l2=1e-4))

#Linear iterator for real valued data, cce cost for softmax
model = RNN(layers=layers, updater=updater, iterator='linear', cost='cce')
model.fit(trX, trY, n_epochs=20)

tr_preds = model.predict(trX[:len(teY)])
te_preds = model.predict(teX)

tr_acc = np.mean(trY[:len(teY)] == np.argmax(tr_preds, axis=1))
te_acc = np.mean(teY == np.argmax(te_preds, axis=1))

# Test accuracy should be between 98.9% and 99.3%
print 'train accuracy', tr_acc, 'test accuracy', te_acc

However, this raised an error:

Traceback (most recent call last):
  File "/.../ex2.py", line 24, in <module>
    model = RNN(layers=layers, updater=updater, iterator='linear', cost='cce')
  File "/.../models.py", line 44, in __init__
    self.y_tr = self.layers[-1].output(dropout_active=True)
  File "/.../layers.py", line 297, in output
    X = self.l_in.output(dropout_active=dropout_active)
  File "/.../layers.py", line 190, in output
    truncate_gradient=self.truncate_gradient
  File "/.../theano/scan_module/scan.py", line 1042, in scan
    scan_outs = local_op(*scan_inputs)
  File "/.../theano/gof/op.py", line 507, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/.../theano/scan_module/scan_op.py", line 374, in make_node
    inner_sitsot_out.type.dtype))
ValueError: When compiling the inner function of scan the following error has been encountered: The initial state (`outputs_info` in scan nomenclature) of variable IncSubtensor{Set;:int64:}.0 (argument number 4) has dtype float32, while the result of the inner function (`fn`) has dtype float64. This can happen if the inner function of scan results in an upcast or downcast.

How can I fix this? Or is there anything else I can do to make the program run?
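Note on the error message: it reports that the initial state passed to `scan` is float32 while the step function returns float64. Theano's `floatX` defaults to float64, so the upcast presumably comes from a float32 value being combined with float64 values inside the LSTM step. The NumPy sketch below is only an analogy for that kind of upcast. The array names are hypothetical and none of this is Passage code.

```python
import numpy as np

# Hypothetical stand-ins: a float32 initial state and float64 weights.
h0 = np.zeros(4, dtype=np.float32)
w = np.ones(4, dtype=np.float64)

# Mixing the two dtypes silently upcasts the result to float64,
# so the "next state" no longer matches the dtype of the initial state.
h1 = h0 * w
print(h0.dtype, h1.dtype)

# Keeping every operand in float32 preserves the dtype.
h1_fixed = h0 * w.astype(np.float32)
print(h1_fixed.dtype)
```

Forcing a single float type everywhere avoids the mismatch.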

naeemulhassan commented 9 years ago

I had the same problem. Running the program as follows worked for me:

THEANO_FLAGS='floatX=float32' python myprogram.py

madisonmay commented 9 years ago

You can also set this in your ~/.theanorc file:

[global]
floatX = float32
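
If neither the command line nor ~/.theanorc is convenient, the same flag can be set from inside the script, as long as this happens before Theano is first imported (Theano reads `THEANO_FLAGS` at import time). A minimal sketch, where `with_floatx` is a hypothetical helper and not part of Theano or Passage:

```python
import os


def with_floatx(flags, floatx="float32"):
    """Return a THEANO_FLAGS string with floatX forced to `floatx`.

    Hypothetical helper: keeps any other comma-separated flags already set.
    """
    kept = [f for f in flags.split(",")
            if f and not f.strip().startswith("floatX=")]
    return ",".join(kept + ["floatX=" + floatx])


# Must run before the first `import theano` anywhere in the program.
os.environ["THEANO_FLAGS"] = with_floatx(os.environ.get("THEANO_FLAGS", ""))

# import theano  # in a real script, import Theano (and Passage) only after this point
print(os.environ["THEANO_FLAGS"])
```

Setting the variable after Theano has been imported has no effect on that process, so the assignment belongs at the very top of the script.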