baidu-research / ba-dls-deepspeech


Training stops with a "ValueError: total size of new array must be unchanged" #9

Closed: aglotero closed this issue 7 years ago

aglotero commented 7 years ago

I generated the training data from the LibriSpeech audio files (using the download.sh, flac_to_wav.sh and create_desc_file.py scripts). With my installation (Theano 0.8.2 and Keras 1.1.2 on Python 2.7, per the paths in the traceback below), I started training using this command line:

THEANO_FLAGS='optimizer=None' python train.py dev-clean.json dev-other.json meu_modelo/

Training stops with this error: ValueError: total size of new array must be unchanged

Full stacktrace:

Using Theano backend.
Using gpu device 0: GeForce GTX 750 Ti (CNMeM is enabled with initial size: 95.0% of memory, cuDNN 5105)
/usr/local/lib/python2.7/dist-packages/Theano-0.8.2-py2.7.egg/theano/sandbox/cuda/__init__.py:600: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.
  warnings.warn(warn)
/usr/local/lib/python2.7/dist-packages/Theano-0.8.2-py2.7.egg/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
  "downsample module has been moved to the theano.tensor.signal.pool module.")
2016-12-04 00:58:47,478 INFO    (data_generator) Reading description file: dev-clean.json for partition: train
2016-12-04 00:58:47,479 INFO    (data_generator) Reading description file: dev-other.json for partition: validation
2016-12-04 00:58:47,618 INFO    (model) Building gru model
2016-12-04 00:58:47,811 INFO    (model) Building train_fn
2016-12-04 00:58:50,405 INFO    (model) Building val_fn
2016-12-04 00:58:50,607 INFO    (data_generator) Iters: 2
Traceback (most recent call last):
  File "train.py", line 155, in <module>
    args.sortagrad)
  File "train.py", line 133, in main
    do_sortagrad=sortagrad)
  File "train.py", line 89, in train
    label_lengths, True])
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.1.2-py2.7.egg/keras/backend/theano_backend.py", line 811, in __call__
    return self.function(*inputs)
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.8.2-py2.7.egg/theano/compile/function_module.py", line 871, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.8.2-py2.7.egg/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.8.2-py2.7.egg/theano/compile/function_module.py", line 859, in __call__
    outputs = self.fn()
ValueError: total size of new array must be unchanged
Apply node that caused the error: Reshape{3}(Elemwise{add,no_inplace}.0, MakeVector{dtype='int64'}.0)
Toposort index: 599
Inputs types: [TensorType(float32, matrix), TensorType(int64, vector)]
Inputs shapes: [(8424, 100), (3,)]
Inputs strides: [(400, 4), (8,)]
Inputs values: ['not shown', array([ -1, 100, 100])]
Outputs clients: [[Join(TensorConstant{2}, Reshape{3}.0, Reshape{3}.0, Reshape{3}.0), Shape(Reshape{3}.0)]]

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
  File "/export/ba-dls-deepspeech/model.py", line 118, in compile_gru_model
    return_sequences=True)(output)
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.1.2-py2.7.egg/keras/engine/topology.py", line 517, in __call__
    self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.1.2-py2.7.egg/keras/engine/topology.py", line 571, in add_inbound_node
    Node.create_node(self, inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.1.2-py2.7.egg/keras/engine/topology.py", line 155, in create_node
    output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.1.2-py2.7.egg/keras/layers/recurrent.py", line 219, in call
    preprocessed_input = self.preprocess_input(x)
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.1.2-py2.7.egg/keras/layers/recurrent.py", line 541, in preprocess_input
    input_dim, self.output_dim, timesteps)
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.1.2-py2.7.egg/keras/layers/recurrent.py", line 38, in time_distributed_dense
    x = K.reshape(x, (-1, timesteps, output_dim))
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.1.2-py2.7.egg/keras/backend/theano_backend.py", line 480, in reshape
    return T.reshape(x, shape)

HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
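The hints in the traceback refer to Theano flags, which can be passed through the THEANO_FLAGS environment variable as a comma-separated list (the same variable used in the command line above). A minimal sketch of setting them from Python follows; the helper name build_theano_flags and the traceback.limit value of 20 are illustrative assumptions, not part of the repository.

```python
import os


def build_theano_flags(pairs):
    """Join (name, value) pairs into the comma-separated THEANO_FLAGS format."""
    return ",".join("{0}={1}".format(name, value) for name, value in pairs)


# Flags named in this issue: optimizer=None (from the command line) and the
# two flags suggested by the traceback hints. The limit of 20 is arbitrary.
flags = build_theano_flags([
    ("optimizer", "None"),
    ("exception_verbosity", "high"),
    ("traceback.limit", 20),
])

# The variable has to be set before theano is imported to take effect.
os.environ["THEANO_FLAGS"] = flags
print(flags)
```

Setting the same string in the shell before calling train.py has the same effect.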

Any advice about what is going on?

Regards, André

Ka-ya commented 7 years ago

I'm not sure what the actual solution is, but a workaround is to try downgrading to an older Keras version.
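For reference, the arithmetic behind the error can be checked directly from the shapes in the traceback: the input is (8424, 100) and the requested shape is (-1, 100, 100), which comes from K.reshape(x, (-1, timesteps, output_dim)). The sketch below is plain Python that mimics the size check a reshape performs; resolve_reshape is a hypothetical helper written for illustration, not a Theano or Keras function, and it does not explain why the graph ended up with timesteps equal to 100.

```python
from functools import reduce
from operator import mul


def resolve_reshape(input_shape, target_shape):
    """Resolve a single -1 in target_shape, or raise ValueError on a size mismatch."""
    target_shape = tuple(target_shape)
    if target_shape.count(-1) > 1:
        raise ValueError("only one dimension may be -1")
    total = reduce(mul, input_shape, 1)
    known = reduce(mul, (d for d in target_shape if d != -1), 1)
    if -1 in target_shape:
        # The -1 dimension must absorb the remaining elements exactly.
        if known == 0 or total % known != 0:
            raise ValueError("total size of new array must be unchanged")
        return tuple(total // known if d == -1 else d for d in target_shape)
    if total != known:
        raise ValueError("total size of new array must be unchanged")
    return target_shape


# Shapes from the traceback: 8424 * 100 = 842400 elements, which is not a
# multiple of 100 * 100 = 10000, so no value of -1 can make the sizes match.
try:
    resolve_reshape((8424, 100), (-1, 100, 100))
    failed = False
except ValueError:
    failed = True
print(failed)
```

A row count that is a multiple of 100, such as 8400, would resolve cleanly to (84, 100, 100), which suggests the actual number of time steps in the batch did not match the 100 used in the reshape.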

aglotero commented 7 years ago

Thanks @Ka-ya !

I downgraded to Keras 1.1.0 and it is working!

pip install Keras==1.1.0
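A small guard like the one below could be run before training to confirm which Keras version is active. It is only a sketch: the function names are hypothetical, and the only facts it encodes are the two from this thread (1.1.2 appears in the failing traceback, 1.1.0 was reported working). Every other version is treated as untested.

```python
def parse_version(text):
    """Turn a version string such as '1.1.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".")[:3])


REPORTED_WORKING = (1, 1, 0)  # reported working in this thread
REPORTED_FAILING = (1, 1, 2)  # version shown in the failing traceback


def check_keras_version(version_text):
    """Classify a Keras version using only what this issue reports."""
    version = parse_version(version_text)
    if version == REPORTED_WORKING:
        return "reported-working"
    if version == REPORTED_FAILING:
        return "reported-failing"
    return "untested"


print(check_keras_version("1.1.0"))
print(check_keras_version("1.1.2"))
```

In practice the argument would be keras.__version__, read after importing keras in the training environment.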

Regards, André