Upcast to `float64` when using any RNN

While debugging huge memory consumption issues, I found that using any RNN with return_sequences=False (e.g. SimpleRNN, GRU) results in an upcast to float64. Consequently everything that follows is also float64, and memory consumption increases. As stated in the Theano documentation (http://deeplearning.net/software/theano/faq.html#float32-int-32-64-gives-float64), any operation between float32 and int32/int64 results in float64.

Would the solution be to cast all integers to float32 to prevent unnecessary memory usage?
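To make the promotion rule concrete, here is a minimal sketch in plain NumPy. It assumes NumPy's dtype promotion behaves like the Theano rule quoted above for these dtypes; it does not touch Theano itself. The element count is taken from the x_train shape in the example script (320 samples, 8 timesteps, 16 features).

```python
import numpy as np

# Which dtype results from combining float32 with various integer widths?
for int_type in (np.int16, np.int32, np.int64):
    result = np.result_type(np.float32, int_type)
    print(np.dtype(int_type).name, "->", result.name)

# Memory cost of the upcast for one tensor shaped like x_train.
n_values = 320 * 8 * 16
bytes_f32 = n_values * np.dtype(np.float32).itemsize
bytes_f64 = n_values * np.dtype(np.float64).itemsize
print(bytes_f32, bytes_f64)
```

Only the narrow int16 keeps the result in float32 here; int32 and int64 both push it to float64, which doubles the storage for every tensor that inherits that dtype.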

# Problem example that casts to float64
#   THEANO_FLAGS='floatX=float32,warn_float64=raise' python foo.py

import numpy as np
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense

data_dim = 16
timesteps = 8
nb_classes = 10
batch_size = 32

model = Sequential()
model.add(SimpleRNN(nb_classes, return_sequences=False, batch_input_shape=(batch_size, timesteps, data_dim)))
#model.add(Dense(nb_classes, batch_input_shape=(batch_size, data_dim)))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

# generate dummy training data
x_train = np.random.random((batch_size * 10, timesteps, data_dim)).astype(np.float32)
#x_train = np.random.random((batch_size * 10, data_dim)).astype(np.float32)
y_train = np.random.random((batch_size * 10, nb_classes)).astype(np.float32)

model.fit(x_train, y_train, batch_size=batch_size, nb_epoch=5)

Triggering the issue by raising Theano warnings for float64 (latest Keras https://github.com/fchollet/keras/commit/b587aeee1c1be3633a56b945af3e7c2c303369ca, Theano 0.8.1):

$ THEANO_FLAGS='floatX=float32,warn_float64=raise' python foo.py 
Using Theano backend.
ERROR (theano.gof.opt): Optimization failure due to: local_opt_alloc
ERROR (theano.gof.opt): node: Sum{axis=[1], acc_dtype=float64}(Alloc.0)
ERROR (theano.gof.opt): TRACEBACK:
ERROR (theano.gof.opt): Traceback (most recent call last):
  File "/home/venv/local/lib/python2.7/site-packages/theano/gof/opt.py", line 1772, in process_node
    replacements = lopt.transform(node)
  File "/home/venv/local/lib/python2.7/site-packages/theano/tensor/opt.py", line 5137, in local_opt_alloc
    val *= T.mul(*to_prod)
  File "/home/venv/local/lib/python2.7/site-packages/theano/tensor/var.py", line 240, in __rmul__
    return theano.tensor.basic.mul(other, self)
  File "/home/venv/local/lib/python2.7/site-packages/theano/gof/op.py", line 611, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/home/venv/local/lib/python2.7/site-packages/theano/tensor/elemwise.py", line 597, in make_node
    out_broadcastables)]
  File "/home/venv/local/lib/python2.7/site-packages/theano/gof/type.py", line 400, in __call__
    return utils.add_tag_trace(self.make_variable(name))
  File "/home/venv/local/lib/python2.7/site-packages/theano/tensor/type.py", line 431, in make_variable
    return self.Variable(self, name=name)
  File "/home/venv/local/lib/python2.7/site-packages/theano/tensor/var.py", line 762, in __init__
    raise Exception(msg)
Exception: You are creating a TensorVariable with float64 dtype. You requested an action via the Theano flag warn_float64={ignore,warn,raise,pdb}.

ERROR (theano.gof.opt): Optimization failure due to: local_opt_alloc
ERROR (theano.gof.opt): node: Sum{axis=[1], acc_dtype=float64}(Alloc.0)
...
Exception: You are creating a TensorVariable with float64 dtype. You requested an action via the Theano flag warn_float64={ignore,warn,raise,pdb}.

Epoch 1/5
320/320 [==============================] - 0s - loss: 25.0333     
Epoch 2/5
320/320 [==============================] - 0s - loss: 23.8539     
Epoch 3/5
320/320 [==============================] - 0s - loss: 23.5233     
Epoch 4/5
320/320 [==============================] - 0s - loss: 23.4157     
Epoch 5/5
320/320 [==============================] - 0s - loss: 23.3869
$

It seems there is some multiplication between a constant float32 and int64 happening:

(Pdb) self
<theano.tensor.elemwise.Elemwise object at 0x7faefc245a90>
(Pdb) inputs
[TensorConstant{0.0}, Elemwise{mul,no_inplace}.0]
(Pdb) inputs[0].type
TensorType(float32, scalar)
(Pdb) inputs[1].type
TensorType(int64, scalar)

I think this was fixed in Theano 0.8.2.

@gw0 can you confirm? Otherwise, any idea what might be causing an upcast?

Unfortunately the upcasting is still there. :(

$ pip freeze | egrep -i 'theano|keras'
Keras==1.0.1
Theano==0.8.2

$ THEANO_FLAGS='floatX=float32,warn_float64=raise' python test_upcast-float64.py
Using Theano backend.
ERROR (theano.gof.opt): Optimization failure due to: local_opt_alloc
ERROR (theano.gof.opt): node: Sum{axis=[1], acc_dtype=float64}(Alloc.0)
ERROR (theano.gof.opt): TRACEBACK:
ERROR (theano.gof.opt): Traceback (most recent call last):
  File "./venv/local/lib/python2.7/site-packages/theano/gof/opt.py", line 1772, in process_node
    replacements = lopt.transform(node)
  File "./venv/local/lib/python2.7/site-packages/theano/tensor/opt.py", line 5137, in local_opt_alloc
    val *= T.mul(*to_prod)
  File "./venv/local/lib/python2.7/site-packages/theano/tensor/var.py", line 240, in __rmul__
    return theano.tensor.basic.mul(other, self)
  File "./venv/local/lib/python2.7/site-packages/theano/gof/op.py", line 611, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "./venv/local/lib/python2.7/site-packages/theano/tensor/elemwise.py", line 597, in make_node
    out_broadcastables)]
  File "./venv/local/lib/python2.7/site-packages/theano/gof/type.py", line 400, in __call__
    return utils.add_tag_trace(self.make_variable(name))
  File "./venv/local/lib/python2.7/site-packages/theano/tensor/type.py", line 431, in make_variable
    return self.Variable(self, name=name)
  File "./venv/local/lib/python2.7/site-packages/theano/tensor/var.py", line 762, in __init__
    raise Exception(msg)
Exception: You are creating a TensorVariable with float64 dtype. You requested an action via the Theano flag warn_float64={ignore,warn,raise,pdb}.
...

It seems something was changed in Theano, but the int64 still comes out of an element-wise multiplication. (Sorry for the extra-long debugging attempt, but I am not sure what is essential.)

$ THEANO_FLAGS='floatX=float32,warn_float64=pdb' python test_upcast-float64.py      
... (enter "u" a couple of times) ...
> ./venv/local/lib/python2.7/site-packages/theano/gof/op.py(611)__call__()
-> node = self.make_node(*inputs, **kwargs)
(Pdb) l
606                 when the output of `make_node()` contains a single element, or this
607                 output (unchanged) when it contains multiple elements.
608  
609             """
610             return_list = kwargs.pop('return_list', False)
611  ->         node = self.make_node(*inputs, **kwargs)
612  
613             if config.compute_test_value != 'off':
614                 run_perform = True
615  
616                 # build test input-values
(Pdb) self
<theano.tensor.elemwise.Elemwise object at 0x7f9f3e8ebcd0>
(Pdb) inputs
(0.0, Elemwise{mul,no_inplace}.0)
(Pdb) inputs[0].type
*** AttributeError: 'numpy.float32' object has no attribute 'type'
(Pdb) inputs[1].type
TensorType(int64, scalar)
... (enter "u" two times) ...
> ./venv/local/lib/python2.7/site-packages/theano/tensor/opt.py(5137)local_opt_alloc()
-> val *= T.mul(*to_prod)
(Pdb) l
5132                        val = val.reshape(1)[0]
5133                        to_prod = [shapes[i] for i in xrange(len(shapes))
5134                                   if i in node.op.axis]
5135                        if to_prod:
5136                            if isinstance(node.op, T.Sum):
5137 ->                             val *= T.mul(*to_prod)
5138                            else:
5139                                val = val ** T.mul(*to_prod)
5140                        return [T.alloc(T.cast(val, dtype=node.outputs[0].dtype),
5141                                        *[shapes[i] for i in xrange(len(shapes))
5142                                          if i not in node.op.axis])]
(Pdb) node
Sum{axis=[1], acc_dtype=float64}(Alloc.0)
(Pdb) val
0.0
(Pdb) to_prod
[Shape_i{1}.0]
(Pdb) to_prod[0].type
TensorType(int64, scalar)
(Pdb) node.op.axis
(1,)
(Pdb) shapes
[Shape_i{0}.0, Shape_i{1}.0, Shape_i{2}.0]
(Pdb) shapes[0].type
TensorType(int64, scalar)
(Pdb) shapes[1].type
TensorType(int64, scalar)
(Pdb) shapes[2].type
TensorType(int64, scalar)
(Pdb) node_inps.owner.inputs
[TensorConstant{(1, 1, 1) of 0.0}, Shape_i{0}.0, Shape_i{1}.0, Shape_i{2}.0]
(Pdb) node
Sum{axis=[1], acc_dtype=float64}(Alloc.0)
... (enter "u") ...
> ./venv/local/lib/python2.7/site-packages/theano/gof/opt.py(2196)apply()
-> lopt_change = self.process_node(fgraph, node, lopt)
(Pdb) l
2191                        for lopt in (self.local_optimizers_all +
2192                                     self.local_optimizers_map.get(type(node.op), []) +
2193                                     self.local_optimizers_map.get(node.op, [])):
2194                            nb = change_tracker.nb_imported
2195                            t_opt = time.time()
2196 ->                         lopt_change = self.process_node(fgraph, node, lopt)
2197                            time_opts[lopt] += time.time() - t_opt
2198                            if not lopt_change:
2199                                continue
2200                            process_count.setdefault(lopt, 0)
2201                            process_count[lopt] += 1
(Pdb) theano.printing.debugprint(node.inputs[0])
Alloc [id A] ''   
 |TensorConstant{(1, 1, 1) of 0.0} [id B]
 |Shape_i{0} [id C] ''   
 | |simplernn_input_1 [id D]
 |Shape_i{1} [id E] ''   
 | |simplernn_input_1 [id D]
 |Shape_i{2} [id F] ''   
   |simplernn_input_1 [id D]
(Pdb) theano.printing.pydotprint(node.inputs[0], outfile="node.png", var_with_name_simple=True)
The output file is available at node.png

So it seems this is related to shape computation: the three Shape_i scalars taken from simplernn_input_1 are of type int64. Maybe the float64 values above only arise in shape computation, and the full matrices are still float32?
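For readers wondering why a shape value gets multiplied into a float constant at all: judging from the opt.py listing above, local_opt_alloc seems to replace "sum over a constant-filled Alloc" with "constant times the size of the summed axis". A rough NumPy sketch of that equivalence follows; the shape and the fill value are made up for illustration, and this is not Theano code.

```python
import numpy as np

val = np.float32(0.5)      # hypothetical fill value (the thread has 0.0)
shape = (4, 8, 16)         # hypothetical (batch, timesteps, features)
axis = 1

# Unoptimized form: allocate the full constant tensor, then sum over axis 1.
full = np.full(shape, val, dtype=np.float32)
slow = full.sum(axis=axis)

# Rewritten form: scale the constant by the summed dimension (an int64, like
# Shape_i{1}) and allocate only the smaller result. The int64 is cast to
# float32 first so the product does not leave float32.
n = np.int64(shape[axis])
fast = np.full((shape[0], shape[2]), val * n.astype(np.float32), dtype=np.float32)

print(np.array_equal(slow, fast), slow.dtype, fast.dtype)
```

The multiplication by `n` is where an int64 meets the float32 constant, which matches the `val *= T.mul(*to_prod)` line the traceback points at.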

But int64 and float64 appear multiple times in the whole computation graph:

(Pdb) theano.printing.debugprint(fgraph)                                                     
Elemwise{true_div,no_inplace} [id A] 'mean'   102
 |Elemwise{true_div,no_inplace} [id B] ''   100
 | |Sum{acc_dtype=float64} [id C] ''   98
 | | |Elemwise{mul,no_inplace} [id D] ''   96
 | |   |Elemwise{neg,no_inplace} [id E] 'mean'   94
 | |   | |Sum{axis=[1], acc_dtype=float64} [id F] ''   91
 | |   |   |Elemwise{mul,no_inplace} [id G] ''   88
 | |   |     |simplernn_1_target [id H]
 | |   |     |Elemwise{log,no_inplace} [id I] ''   86
 | |   |       |Elemwise{clip,no_inplace} [id J] ''   81
 | |   |         |Elemwise{true_div,no_inplace} [id K] ''   80
 | |   |         | |Subtensor{int64} [id L] ''   73
 | |   |         | | |DimShuffle{0,1,2} [id M] ''   70
 | |   |         | | | |for{cpu,scan_fn}.1 [id N] ''   69
 | |   |         | | |   |TensorConstant{8} [id O]
 | |   |         | | |   |DimShuffle{1,0,2} [id P] ''   61
 | |   |         | | |   | |Reshape{3} [id Q] ''   54
 | |   |         | | |   |   |Elemwise{add,no_inplace} [id R] ''   45
 | |   |         | | |   |   | |Dot22 [id S] ''   30
 | |   |         | | |   |   | | |Reshape{2} [id T] ''   9
 | |   |         | | |   |   | | | |simplernn_input_1 [id U]
 | |   |         | | |   |   | | | |TensorConstant{[-1 16]} [id V]
 | |   |         | | |   |   | | |simplernn_1_W [id W]
 | |   |         | | |   |   | |DimShuffle{x,0} [id X] ''   8
 | |   |         | | |   |   |   |simplernn_1_b [id Y]
 | |   |         | | |   |   |TensorConstant{[-1  8 10]} [id Z]
 | |   |         | | |   |IncSubtensor{Set;:int64:} [id BA] ''   67
 | |   |         | | |   | |AllocEmpty{dtype='float32'} [id BB] ''   23
 | |   |         | | |   | | |TensorConstant{9} [id BC]
 | |   |         | | |   | | |Shape_i{0} [id BD] ''   3
 | |   |         | | |   | | | |simplernn_input_1 [id U]
 | |   |         | | |   | | |Shape_i{1} [id BE] ''   7
 | |   |         | | |   | |   |<TensorType(float32, matrix)> [id BF]
 | |   |         | | |   | |Rebroadcast{0} [id BG] ''   64
 | |   |         | | |   | | |DimShuffle{x,0,1} [id BH] ''   58
 | |   |         | | |   | |   |Dot22 [id BI] ''   51
 | |   |         | | |   | |     |Sum{axis=[1], acc_dtype=float64} [id BJ] ''   41
 | |   |         | | |   | |     | |Alloc [id BK] ''   20
 | |   |         | | |   | |     |   |TensorConstant{(1, 1, 1) of 0.0} [id BL]
 | |   |         | | |   | |     |   |Shape_i{0} [id BD] ''   3
 | |   |         | | |   | |     |   |Shape_i{1} [id BM] ''   2
 | |   |         | | |   | |     |   | |simplernn_input_1 [id U]
 | |   |         | | |   | |     |   |Shape_i{2} [id BN] ''   1
 | |   |         | | |   | |     |     |simplernn_input_1 [id U]
 | |   |         | | |   | |     |<TensorType(float32, matrix)> [id BF]
 | |   |         | | |   | |Constant{1} [id BO]
 | |   |         | | |   |TensorConstant{8} [id O]
 | |   |         | | |   |simplernn_1_U [id BP]
 | |   |         | | |Constant{-1} [id BQ]
 | |   |         | |DimShuffle{0,x} [id BR] ''   79
 | |   |         |   |Sum{axis=[1], acc_dtype=float64} [id BS] ''   76
 | |   |         |     |Subtensor{int64} [id L] ''   73
 | |   |         |TensorConstant{(1, 1) of 1e-07} [id BT]
 | |   |         |TensorConstant{(1, 1) of 1.0} [id BU]
 | |   |simplernn_1_sample_weights [id BV]
 | |Elemwise{true_div,no_inplace} [id BW] 'mean'   56
 |   |Sum{acc_dtype=float64} [id BX] ''   46
 |   | |Elemwise{Cast{float32}} [id BY] ''   32
 |   |   |Elemwise{neq,no_inplace} [id BZ] ''   10
 |   |     |simplernn_1_sample_weights [id BV]
 |   |     |TensorConstant{(1,) of 0} [id CA]
 |   |Elemwise{Cast{float32}} [id CB] ''   33
 |     |Shape_i{0} [id CC] ''   12
 |       |simplernn_1_sample_weights [id BV]
 |Elemwise{Cast{float32}} [id CD] ''   34
   |Shape_i{0} [id CE] ''   13
     |simplernn_1_target [id H]
Elemwise{sub,no_inplace} [id CF] ''   144
 |simplernn_1_W [id W]
 |Elemwise{true_div,no_inplace} [id CG] ''   141
   |Dot22Scalar [id CH] ''   118
   | |DimShuffle{1,0} [id CI] ''   31
   | | |Reshape{2} [id T] ''   9
   | |Reshape{2} [id CJ] ''   114
   | | |DimShuffle{1,0,2} [id CK] ''   112
   | | | |IncSubtensor{Inc;int64::} [id CL] ''   110
   | | |   |Alloc [id CM] ''   63
   | | |   | |TensorConstant{(1, 1, 1) of 0.0} [id BL]
   | | |   | |TensorConstant{8} [id CN]
   | | |   | |Elemwise{int_div,no_inplace} [id CO] ''   57
   | | |   | | |Elemwise{mul,no_inplace} [id CP] ''   50
   | | |   | | | |Elemwise{int_div,no_inplace} [id CQ] ''   40
   | | |   | | | | |Elemwise{mul,no_inplace} [id CR] ''   19
   | | |   | | | | | |Shape_i{0} [id BD] ''   3
   | | |   | | | | | |Shape_i{1} [id BM] ''   2
   | | |   | | | | | |Shape_i{2} [id BN] ''   1
   | | |   | | | | |TensorConstant{16} [id CS]
   | | |   | | | |Shape_i{1} [id CT] ''   0
   | | |   | | |   |simplernn_1_W [id W]
   | | |   | | |TensorConstant{80} [id CU]
   | | |   | |TensorConstant{10} [id CV]
   | | |   |IncSubtensor{Inc;:int64:} [id CW] ''   108
   | | |   | |Alloc [id CM] ''   63
   | | |   | |Subtensor{::int64} [id CX] ''   106
   | | |   | | |for{cpu,grad_of_scan_fn}.1 [id CY] ''   105
   | | |   | | | |TensorConstant{8} [id O]
   | | |   | | | |Elemwise{sub} [id CZ] ''   77
   | | |   | | | | |TensorConstant{(1, 1, 1) of 1.0} [id DA]
   | | |   | | | | |Elemwise{sqr} [id DB] ''   74
   | | |   | | | |   |Subtensor{int64:int64:int64} [id DC] ''   71
   | | |   | | | |     |for{cpu,scan_fn}.0 [id N] ''   69
   | | |   | | | |     |Constant{8} [id DD]
   | | |   | | | |     |Constant{0} [id DE]
   | | |   | | | |     |Constant{-1} [id BQ]
   | | |   | | | |Subtensor{int64:int64:int64} [id DF] ''   104
   | | |   | | | | |DimShuffle{0,1,2} [id DG] ''   103
   | | |   | | | | | |IncSubtensor{Inc;int64} [id DH] ''   101
   | | |   | | | | |   |Alloc [id DI] ''   62
   | | |   | | | | |   | |TensorConstant{(1, 1, 1) of 0.0} [id BL]
   | | |   | | | | |   | |TensorConstant{8} [id O]
   | | |   | | | | |   | |Elemwise{int_div,no_inplace} [id CO] ''   57
   | | |   | | | | |   | |TensorConstant{10} [id DJ]
   | | |   | | | | |   |Elemwise{add,no_inplace} [id DK] ''   99
   | | |   | | | | |   | |Elemwise{true_div,no_inplace} [id DL] ''   93
   | | |   | | | | |   | | |Elemwise{mul,no_inplace} [id DM] ''   90
   | | |   | | | | |   | | | |TensorConstant{(1, 1) of -1.0} [id DN]
   | | |   | | | | |   | | | |Elemwise{AND} [id DO] ''   87
   | | |   | | | | |   | | | | |Elemwise{GE} [id DP] ''   83
   | | |   | | | | |   | | | | | |Elemwise{true_div,no_inplace} [id K] ''   80
   | | |   | | | | |   | | | | | |TensorConstant{(1, 1) of 1e-07} [id BT]
   | | |   | | | | |   | | | | |Elemwise{LE} [id DQ] ''   82
   | | |   | | | | |   | | | |   |Elemwise{true_div,no_inplace} [id K] ''   80
   | | |   | | | | |   | | | |   |TensorConstant{(1, 1) of 1.0} [id BU]
   | | |   | | | | |   | | | |DimShuffle{x,x} [id DR] ''   47
   | | |   | | | | |   | | | | |Elemwise{Cast{float32}} [id CB] ''   33
   | | |   | | | | |   | | | |DimShuffle{0,x} [id DS] ''   11
   | | |   | | | | |   | | | | |simplernn_1_sample_weights [id BV]
   | | |   | | | | |   | | | |simplernn_1_target [id H]
   | | |   | | | | |   | | |Elemwise{mul,no_inplace} [id DT] ''   85
   | | |   | | | | |   | |   |DimShuffle{x,x} [id DU] ''   48
   | | |   | | | | |   | |   | |Elemwise{Cast{float32}} [id CD] ''   34
   | | |   | | | | |   | |   |DimShuffle{x,x} [id DV] ''   55
   | | |   | | | | |   | |   | |Sum{acc_dtype=float64} [id BX] ''   46
   | | |   | | | | |   | |   |Elemwise{clip,no_inplace} [id J] ''   81
   | | |   | | | | |   | |   |DimShuffle{0,x} [id BR] ''   79
   | | |   | | | | |   | |DimShuffle{0,x} [id DW] ''   97
   | | |   | | | | |   |   |Sum{axis=[1], acc_dtype=float64} [id DX] ''   95
   | | |   | | | | |   |     |Elemwise{true_div,no_inplace} [id DY] ''   92
   | | |   | | | | |   |       |Elemwise{mul,no_inplace} [id DZ] ''   89
   | | |   | | | | |   |       | |Elemwise{AND} [id DO] ''   87
   | | |   | | | | |   |       | |DimShuffle{x,x} [id DR] ''   47
   | | |   | | | | |   |       | |DimShuffle{0,x} [id DS] ''   11
   | | |   | | | | |   |       | |simplernn_1_target [id H]
   | | |   | | | | |   |       | |Subtensor{int64} [id L] ''   73
   | | |   | | | | |   |       |Elemwise{mul,no_inplace} [id EA] ''   84
   | | |   | | | | |   |         |DimShuffle{x,x} [id DU] ''   48
   | | |   | | | | |   |         |DimShuffle{x,x} [id DV] ''   55
   | | |   | | | | |   |         |Elemwise{clip,no_inplace} [id J] ''   81
   | | |   | | | | |   |         |DimShuffle{0,x} [id BR] ''   79
   | | |   | | | | |   |         |DimShuffle{0,x} [id BR] ''   79
   | | |   | | | | |   |Constant{-1} [id BQ]
   | | |   | | | | |Constant{7} [id EB]
   | | |   | | | | |Constant{-9} [id EC]
   | | |   | | | | |Constant{-1} [id BQ]
   | | |   | | | |Alloc [id ED] ''   22
   | | |   | | | | |TensorConstant{0.0} [id EE]
   | | |   | | | | |TensorConstant{9} [id BC]
   | | |   | | | | |Shape_i{0} [id BD] ''   3
   | | |   | | | | |Shape_i{1} [id BE] ''   7
   | | |   | | | |TensorConstant{8} [id O]
   | | |   | | | |DimShuffle{1,0} [id EF] ''   4
   | | |   | | |   |simplernn_1_U [id BP]
   | | |   | | |Constant{-1} [id BQ]
   | | |   | |Constant{8} [id DD]
   | | |   |Constant{0} [id DE]
   | | |MakeVector{dtype='int64'} [id EG] ''   49
   | |   |Elemwise{int_div,no_inplace} [id CQ] ''   40
   | |   |Shape_i{1} [id CT] ''   0
   | |<TensorType(float32, scalar)> [id EH]
   |Elemwise{sqrt,no_inplace} [id EI] ''   138
     |Elemwise{clip,no_inplace} [id EJ] ''   135
       |Elemwise{add,no_inplace} [id EK] ''   132
       | |TensorConstant{(1, 1) of 1e-06} [id EL]
       | |Elemwise{mul,no_inplace} [id EM] ''   37
       | | |DimShuffle{x,x} [id EN] ''   15
       | | | |<TensorType(float32, scalar)> [id EO]
       | | |<TensorType(float32, matrix)> [id EP]
       | |Elemwise{mul,no_inplace} [id EQ] ''   126
       |   |DimShuffle{x,x} [id ER] ''   39
       |   | |Elemwise{sub,no_inplace} [id ES] ''   16
       |   |   |TensorConstant{1.0} [id ET]
       |   |   |<TensorType(float32, scalar)> [id EO]
       |   |Elemwise{sqr,no_inplace} [id EU] ''   123
       |     |Dot22 [id EV] ''   117
       |       |DimShuffle{1,0} [id CI] ''   31
       |       |Reshape{2} [id CJ] ''   114
       |TensorConstant{(1, 1) of 0.0} [id EW]
       |TensorConstant{(1, 1) of inf} [id EX]
Elemwise{sub,no_inplace} [id EY] ''   143
 |simplernn_1_b [id Y]
 |Elemwise{true_div,no_inplace} [id EZ] ''   140
   |Elemwise{mul,no_inplace} [id FA] ''   122
   | |DimShuffle{x} [id FB] ''   18
   | | |<TensorType(float32, scalar)> [id EH]
   | |Sum{axis=[0], acc_dtype=float64} [id FC] ''   116
   |   |Reshape{2} [id CJ] ''   114
   |Elemwise{sqrt,no_inplace} [id FD] ''   137
     |Elemwise{clip,no_inplace} [id FE] ''   134
       |Elemwise{add,no_inplace} [id FF] ''   130
       | |TensorConstant{(1,) of 1e-06} [id FG]
       | |Elemwise{mul,no_inplace} [id FH] ''   35
       | | |DimShuffle{x} [id FI] ''   14
       | | | |<TensorType(float32, scalar)> [id EO]
       | | |<TensorType(float32, vector)> [id FJ]
       | |Elemwise{mul,no_inplace} [id FK] ''   125
       |   |DimShuffle{x} [id FL] ''   38
       |   | |Elemwise{sub,no_inplace} [id ES] ''   16
       |   |Elemwise{sqr,no_inplace} [id FM] ''   121
       |     |Sum{axis=[0], acc_dtype=float64} [id FC] ''   116
       |TensorConstant{(1,) of 0.0} [id FN]
       |TensorConstant{(1,) of inf} [id FO]
Elemwise{sub,no_inplace} [id FP] ''   142
 |simplernn_1_U [id BP]
 |Elemwise{true_div,no_inplace} [id FQ] ''   139
   |Elemwise{mul,no_inplace} [id FR] ''   120
   | |DimShuffle{x,x} [id FS] ''   17
   | | |<TensorType(float32, scalar)> [id EH]
   | |Elemwise{second,no_inplace} [id FT] ''   115
   |   |TensorConstant{(1, 1) of 0.0} [id FU]
   |   |Elemwise{second,no_inplace} [id FV] ''   113
   |     |Assert{msg='Theano Assert failed!'} [id FW] ''   111
   |     | |Dot22 [id FX] ''   109
   |     | | |Reshape{2} [id FY] ''   78
   |     | | | |DimShuffle{2,0,1} [id FZ] ''   75
   |     | | | | |Subtensor{int64:int64:int64} [id GA] ''   72
   |     | | | |   |for{cpu,scan_fn}.0 [id N] ''   69
   |     | | | |   |Constant{7} [id EB]
   |     | | | |   |Constant{-10} [id GB]
   |     | | | |   |Constant{-1} [id BQ]
   |     | | | |MakeVector{dtype='int64'} [id GC] ''   29
   |     | | |   |Shape_i{1} [id BE] ''   7
   |     | | |   |TensorConstant{-1} [id GD]
   |     | | |Reshape{2} [id GE] ''   107
   |     | |   |for{cpu,grad_of_scan_fn}.1 [id CY] ''   105
   |     | |   |MakeVector{dtype='int64'} [id GF] ''   43
   |     | |     |Elemwise{mul,no_inplace} [id GG] ''   21
   |     | |     | |TensorConstant{8} [id O]
   |     | |     | |Shape_i{0} [id BD] ''   3
   |     | |     |Shape_i{1} [id BE] ''   7
   |     | |Elemwise{eq,no_inplace} [id GH] ''   66
   |     | | |Shape_i{0} [id GI] ''   6
   |     | | | |simplernn_1_U [id BP]
   |     | | |Elemwise{switch,no_inplace} [id GJ] ''   60
   |     | |   |Elemwise{eq,no_inplace} [id GK] ''   28
   |     | |   | |Shape_i{1} [id BE] ''   7
   |     | |   | |TensorConstant{-1} [id GL]
   |     | |   |Elemwise{int_div,no_inplace} [id GM] ''   53
   |     | |   | |Elemwise{mul,no_inplace} [id GN] ''   27
   |     | |   | | |Shape_i{1} [id BE] ''   7
   |     | |   | | |TensorConstant{8} [id GO]
   |     | |   | | |Shape_i{0} [id BD] ''   3
   |     | |   | |Elemwise{neg,no_inplace} [id GP] ''   44
   |     | |   |   |Elemwise{mul,no_inplace} [id GQ] ''   26
   |     | |   |     |Shape_i{1} [id BE] ''   7
   |     | |   |     |TensorConstant{-1} [id GD]
   |     | |   |Shape_i{1} [id BE] ''   7
   |     | |Elemwise{eq,no_inplace} [id GR] ''   68
   |     |   |Shape_i{1} [id GS] ''   5
   |     |   | |simplernn_1_U [id BP]
   |     |   |Elemwise{switch,no_inplace} [id GT] ''   65
   |     |     |Elemwise{eq,no_inplace} [id GU] ''   25
   |     |     | |Shape_i{1} [id BE] ''   7
   |     |     | |TensorConstant{-1} [id GV]
   |     |     |Elemwise{int_div,no_inplace} [id GW] ''   59
   |     |     | |Elemwise{mul,no_inplace} [id GX] ''   24
   |     |     | | |TensorConstant{8} [id O]
   |     |     | | |Shape_i{0} [id BD] ''   3
   |     |     | | |Shape_i{1} [id BE] ''   7
   |     |     | |Elemwise{neg,no_inplace} [id GY] ''   52
   |     |     |   |Elemwise{mul,no_inplace} [id GZ] ''   42
   |     |     |     |Elemwise{mul,no_inplace} [id GG] ''   21
   |     |     |     |Shape_i{1} [id BE] ''   7
   |     |     |Shape_i{1} [id BE] ''   7
   |     |Assert{msg='Theano Assert failed!'} [id FW] ''   111
   |Elemwise{sqrt,no_inplace} [id HA] ''   136
     |Elemwise{clip,no_inplace} [id HB] ''   133
       |Elemwise{add,no_inplace} [id HC] ''   128
       | |TensorConstant{(1, 1) of 1e-06} [id EL]
       | |Elemwise{mul,no_inplace} [id HD] ''   36
       | | |DimShuffle{x,x} [id EN] ''   15
       | | |<TensorType(float32, matrix)> [id HE]
       | |Elemwise{mul,no_inplace} [id HF] ''   124
       |   |DimShuffle{x,x} [id ER] ''   39
       |   |Elemwise{sqr,no_inplace} [id HG] ''   119
       |     |Elemwise{second,no_inplace} [id FT] ''   115
       |TensorConstant{(1, 1) of 0.0} [id EW]
       |TensorConstant{(1, 1) of inf} [id EX]
Elemwise{add,no_inplace} [id HH] ''   131
 |Elemwise{mul,no_inplace} [id EM] ''   37
 |Elemwise{mul,no_inplace} [id EQ] ''   126
Elemwise{add,no_inplace} [id HI] ''   127
 |Elemwise{mul,no_inplace} [id HD] ''   36
 |Elemwise{mul,no_inplace} [id HF] ''   124
Elemwise{add,no_inplace} [id HJ] ''   129
 |Elemwise{mul,no_inplace} [id FH] ''   35
 |Elemwise{mul,no_inplace} [id FK] ''   125

Inner graphs of the scan ops:

for{cpu,scan_fn}.1 [id N] ''   
 >Elemwise{tanh,no_inplace} [id HK] ''   
 > |Elemwise{add,no_inplace} [id HL] ''   
 >   |<TensorType(float32, matrix)> [id HM] -> [id P]
 >   |dot [id HN] ''   
 >     |Elemwise{mul,no_inplace} [id HO] ''   
 >     | |<TensorType(float32, matrix)> [id HP] -> [id BA]
 >     | |TensorConstant{(1, 1) of 1.0} [id HQ]
 >     |simplernn_1_U_copy [id HR] -> [id BP]
 >Elemwise{tanh,no_inplace} [id HK] ''   

for{cpu,grad_of_scan_fn}.1 [id CY] ''   
 >Elemwise{add,no_inplace} [id HS] ''   
 > |Elemwise{mul} [id HT] ''   
 > | |dot [id HU] ''   
 > | | |Elemwise{mul} [id HV] ''   
 > | | | |Elemwise{add,no_inplace} [id HW] ''   
 > | | | | |<TensorType(float32, matrix)> [id HX] -> [id ED]
 > | | | | |<TensorType(float32, matrix)> [id HY] -> [id DF]
 > | | | |<TensorType(float32, matrix)> [id HZ] -> [id CZ]
 > | | |simplernn_1_U_copy.T_replace [id IA] -> [id EF]
 > | |TensorConstant{(1, 1) of 1.0} [id IB]
 > |<TensorType(float32, matrix)> [id IC] -> [id ED]
 >Elemwise{mul} [id ID] ''   
 > |Elemwise{add,no_inplace} [id IE] ''   
 > | |<TensorType(float32, matrix)> [id HX] -> [id ED]
 > | |<TensorType(float32, matrix)> [id HY] -> [id DF]
 > |<TensorType(float32, matrix)> [id HZ] -> [id CZ]

for{cpu,scan_fn}.0 [id N] ''   
 >Elemwise{tanh,no_inplace} [id HK] ''   
 >Elemwise{tanh,no_inplace} [id HK] ''   

for{cpu,scan_fn}.0 [id N] ''   
 >Elemwise{tanh,no_inplace} [id HK] ''   
 >Elemwise{tanh,no_inplace} [id HK] ''   

for{cpu,grad_of_scan_fn}.1 [id CY] ''   
 >Elemwise{add,no_inplace} [id HS] ''   
 >Elemwise{mul} [id ID] ''
(Pdb) theano.printing.pydotprint(fgraph, outfile="fgraph.png", var_with_name_simple=True)                
The output file is available at fgraph.png
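A note on reading the dump above: many nodes are printed as Sum{..., acc_dtype=float64}. As far as I understand it (this is an assumption, not something confirmed in this thread), acc_dtype names the precision of the internal accumulator, while the output of the node can still be float32. A NumPy sketch of that distinction, with made-up data:

```python
import numpy as np

x = np.full((32, 10), 0.1, dtype=np.float32)   # hypothetical float32 activations

# Accumulate in float64 for precision, then hand back a float32 result.
acc = x.sum(axis=1, dtype=np.float64)
out = acc.astype(np.float32)

print(x.dtype, acc.dtype, out.dtype, out.shape)
```

If that reading is right, an acc_dtype=float64 label alone does not mean the surrounding tensors are stored as float64.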

Check whether setting theano.config.cast_policy = 'numpy+floatX' is effective for you. See the upcast implementation in Theano's basic.py.

Don't try that flag. It won't help, and using it isn't recommended. We probably need to get rid of it.

Can you try this diff:

diff --git a/theano/tensor/opt.py b/theano/tensor/opt.py
index 0ab4a91..c312904 100644
--- a/theano/tensor/opt.py
+++ b/theano/tensor/opt.py
@@ -5179,7 +5179,7 @@ def local_opt_alloc(node):
                            if i in node.op.axis]
                 if to_prod:
                     if isinstance(node.op, T.Sum):
-                        val *= T.mul(*to_prod)
+                        val *= T.mul(*to_prod).astype(val.dtype)
                     else:
                         val = val ** T.mul(*to_prod)
                     return [T.alloc(T.cast(val, dtype=node.outputs[0].dtype),

@nouiz: I can't see the diff you refer to; it's just three dots that jump me to the top of the page.

Well, I agree that using that flag will result in less-than-optimal code quality. I think I've traced the problem to shape/size operations, where the size is int64 and, combined with float32, is upcast to float64. The flag helps because it reduces every int64/float32 operation to float32; int64 values are creeping in from everywhere in the code, IMHO.

@nouiz :+1: Found the diff on the mailing list. It works when slightly modified to T.mul(*to_prod).astype(str(val.dtype)).

Thanks.
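For anyone landing here later, the effect of that one-line change can be imitated in plain NumPy. This is only a sketch: the arrays stand in for the Theano variables val and T.mul(*to_prod), and it assumes NumPy's promotion matches Theano's for float32 and int64.

```python
import numpy as np

val = np.zeros(1, dtype=np.float32)        # stands in for the float32 fill value
to_prod = np.array([8], dtype=np.int64)    # stands in for the int64 shape product

# Before the patch: float32 * int64 is promoted to float64.
upcast = val * to_prod

# After the patch: cast the int64 product to val's dtype first.
fixed = val * to_prod.astype(str(val.dtype))

print(upcast.dtype, fixed.dtype)
```

The cast happens on the small shape product, not on the data tensors, so it costs essentially nothing.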

Do you want to make a PR? I'm pretty overloaded currently.

This was fixed by https://github.com/Theano/Theano/pull/4450. This issue can be closed.

This still seems to happen with Theano 0.9.0-dev4.

evpok, you probably have another problem causing an upcast. If you haven't fixed it yet, I suggest you start a new thread.

Closed #2209 https://github.com/fchollet/keras/issues/2209.
