coxlab / prednet

Code and models accompanying "Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning"
https://arxiv.org/abs/1605.08104
MIT License

Input array dimensions not matching #3

Status: Closed (LanaSina closed this issue 7 years ago)

LanaSina commented 7 years ago

I am trying to run PredNet using the pre-processed data and the pretrained model, with nothing modified in the code. (Unrelated: it would be nice to indicate somewhere that your model doesn't support Python 3! Or maybe it is written somewhere and I missed it?)

For

python kitti_evaluate.py

I get the following error:

Using Theano backend.
Traceback (most recent call last):
  File "kitti_evaluate.py", line 52, in <module>
    X_hat = test_model.predict(X_test, batch_size)
  File "//anaconda/lib/python2.7/site-packages/keras/engine/training.py", line 1197, in predict
    batch_size=batch_size, verbose=verbose)
  File "//anaconda/lib/python2.7/site-packages/keras/engine/training.py", line 896, in _predict_loop
    batch_outs = f(ins_batch)
  File "//anaconda/lib/python2.7/site-packages/keras/backend/theano_backend.py", line 792, in __call__
    return self.function(*inputs)
  File "//anaconda/lib/python2.7/site-packages/theano/compile/function_module.py", line 871, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "//anaconda/lib/python2.7/site-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "//anaconda/lib/python2.7/site-packages/theano/compile/function_module.py", line 859, in __call__
    outputs = self.fn()
  File "//anaconda/lib/python2.7/site-packages/theano/scan_module/scan_op.py", line 951, in rval
    r = p(n, [x[0] for x in i], o)
  File "//anaconda/lib/python2.7/site-packages/theano/scan_module/scan_op.py", line 940, in <lambda>
    self, node)
  File "theano/scan_module/scan_perform.pyx", line 405, in theano.scan_module.scan_perform.perform (/Users/lana/.theano/compiledir_Darwin-15.6.0-x86_64-i386-64bit-i386-2.7.10-64/scan_perform/mod.cpp:4316)
  File "//anaconda/lib/python2.7/site-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "theano/scan_module/scan_perform.pyx", line 397, in theano.scan_module.scan_perform.perform (/Users/lana/.theano/compiledir_Darwin-15.6.0-x86_64-i386-64bit-i386-2.7.10-64/scan_perform/mod.cpp:4193)
ValueError: all the input array dimensions except for the concatenation axis must match exactly
Apply node that caused the error: Join(TensorConstant{1}, <TensorType(float32, 4D)>, <TensorType(float32, 4D)>, Reshape{4}.0)
Toposort index: 61
Inputs types: [TensorType(int8, scalar), TensorType(float32, 4D), TensorType(float32, 4D), TensorType(float32, 4D)]
Inputs shapes: [(), (10, 96, 32, 40), (10, 192, 32, 40), (10, 384, 32, 20)]
Inputs strides: [(), (491520, 5120, 160, 4), (983040, 5120, 160, 4), (983040, 2560, 80, 4)]
Inputs values: [array(1, dtype=int8), 'not shown', 'not shown', 'not shown']
Outputs clients: [[CorrMM{half, (1, 1)}(Join.0, Subtensor{::, ::, ::int64, ::int64}.0), CorrMM{half, (1, 1)}(Join.0, Subtensor{::, ::, ::int64, ::int64}.0), CorrMM{half, (1, 1)}(Join.0, Subtensor{::, ::, ::int64, ::int64}.0), CorrMM{half, (1, 1)}(Join.0, Subtensor{::, ::, ::int64, ::int64}.0)]]
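
The "Inputs shapes" line above already shows which axis disagrees. The sketch below is only an illustration in NumPy, not PredNet or Theano code: it rebuilds arrays with those three shapes and reproduces the same kind of failure when joining on axis 1. The second half rests on an assumption that is not in the log: it supposes the third input came from a hypothetical (10, 192, 16, 20) tensor that was upsampled by 2. Repeating along the axes a channels-last layout would use gives exactly the odd (10, 384, 32, 20) shape, which is consistent with a dimension-ordering mix-up, though it does not prove one.

```python
import numpy as np

# Shapes copied from the "Inputs shapes" line of the traceback.
shapes = [(10, 96, 32, 40), (10, 192, 32, 40), (10, 384, 32, 20)]
arrays = [np.zeros(s, dtype=np.float32) for s in shapes]


def mismatched_axes(shapes, concat_axis):
    """Return the axes (other than concat_axis) on which the shapes disagree."""
    ndim = len(shapes[0])
    return [ax for ax in range(ndim)
            if ax != concat_axis and len({s[ax] for s in shapes}) > 1]


bad_axes = mismatched_axes(shapes, concat_axis=1)

try:
    np.concatenate(arrays, axis=1)
    concat_failed = False
except ValueError as err:
    concat_failed = True
    print("concatenate failed:", err)
print("axes that disagree:", bad_axes)

# ASSUMPTION: a hypothetical pre-upsampling tensor; this shape is not in the log.
r_above = np.zeros((10, 192, 16, 20), dtype=np.float32)


def upsample2x(x, axes):
    """Nearest-neighbour 2x upsampling along the given axes."""
    for ax in axes:
        x = np.repeat(x, 2, axis=ax)
    return x


# Channels-first ("th") upsampling repeats rows and columns (axes 2 and 3).
as_channels_first = upsample2x(r_above, axes=(2, 3)).shape
# Channels-last ("tf") upsampling would repeat axes 1 and 2 instead.
as_channels_last = upsample2x(r_above, axes=(1, 2)).shape
print(as_channels_first, as_channels_last)
```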

bill-lotter commented 7 years ago

Hmm I just tested again and everything works fine for me. What version of keras are you using? And can you check K.dim_ordering?

Sorry about the Python 3 issue, I'll add a note in the readme.

LanaSina commented 7 years ago

Hi, thanks for your answer!

I am using Keras 1.1.1, Theano backend. I just noticed you used version 1.0.7 so I installed it. I use Python 2.7. Still getting the same error. BTW dim_ordering is "th"

LanaSina commented 7 years ago

Not sure if relevant but I am running the code on CPU, planning to move to our GPU machine if it looks like I can get it to run without errors on my computer.

bill-lotter commented 7 years ago

Ah gotcha, I've never actually tried on CPU. Not sure why it wouldn't work/what would make it different, but I'll look into it. In the meantime, I would suggest trying it with the GPU.

LanaSina commented 7 years ago

Ok I'll do that. Can you just tell me the version of theano you are using?

bill-lotter commented 7 years ago

I'm using 0.9.0. I would just install the bleeding edge version: pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git

LanaSina commented 7 years ago

Hi, I tried on GPU and got the exact same error: "all the input array dimensions except for the concatenation axis must match exactly". I wonder if other people managed to get your code to work on other machines or if it is just me missing something obvious.

LanaSina commented 7 years ago

It seems that I have other problems in my configuration, so let me fix that first

MaratAkhmatnurov commented 7 years ago

I've got the same problem. I already tried Keras 1.0.7 and Theano 0.9.0, on both CPU and GPU.

GPU raises

 ValueError: GpuJoin: Wrong inputs for input 3 related to inputs 0.!
 Apply node that caused the error: GpuJoin(TensorConstant{1}, <CudaNdarrayType(float32, 4D)>, <CudaNdarrayType(float32, 4D)>, GpuReshape{4}.0)

The full GPU output is in the attachment prednet.txt

LanaSina commented 7 years ago

Sorry for the delay. I have not managed to run the code yet, but I am trying it on different configurations to see if I can pin down what is going wrong.

LanaSina commented 7 years ago

Can you tell me which version of cudnn and cuda you are using?

LanaSina commented 7 years ago

Or maybe you don't use cudnn? When I disable cudnn but still run on the GPU, I get a different, very cryptic error from kitti_evaluate.py, so I tried kitti_train.py instead. That gave me a new input shape error:

  File "kitti_train.py", line 75, in <module>
    validation_data=val_generator, nb_val_samples=N_seq_val)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1461, in fit_generator
    class_weight=class_weight)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1233, in train_on_batch
    check_batch_dim=True)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 979, in _standardize_user_data
    exception_prefix='model input')
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 111, in standardize_input_data
    str(array.shape))
Exception: Error when checking model input: expected input_1 to have shape (None, 10, 3, 128, 160) but got array with shape (4, 10, 128, 160, 3)

Any idea what kind of global setting might cause all these shape errors?

ghost commented 7 years ago

I am also getting the same exact input shape error as above when running kitti_train.py. Tried running prednet on various GPUs (k80, gtx980, m40) and on various systems as user and sudo, but no luck.

Using Theano backend.
Epoch 1/150
Traceback (most recent call last):
  File "kitti_train.py", line 75, in <module>
    validation_data=val_generator, nb_val_samples=N_seq_val)
  File "/home-2/roblim1/theano/lib/python2.7/site-packages/Keras-1.1.1-py2.7.egg/keras/engine/training.py", line 1475, in fit_generator
    class_weight=class_weight)
  File "/home-2/roblim1/theano/lib/python2.7/site-packages/Keras-1.1.1-py2.7.egg/keras/engine/training.py", line 1244, in train_on_batch
    check_batch_dim=True)
  File "/home-2/roblim1/theano/lib/python2.7/site-packages/Keras-1.1.1-py2.7.egg/keras/engine/training.py", line 987, in _standardize_user_data
    exception_prefix='model input')
  File "/home-2/roblim1/theano/lib/python2.7/site-packages/Keras-1.1.1-py2.7.egg/keras/engine/training.py", line 111, in standardize_input_data
    str(array.shape))
Exception: Error when checking model input: expected input_1 to have shape (None, 10, 3, 128, 160) but got array with shape (4, 10, 128, 160, 3)
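
The two shapes in this exception differ only in where the channel axis sits: the model expects (batch, time, channels, rows, cols), while the generator delivers (batch, time, rows, cols, channels). As a minimal NumPy sketch of that relationship (a dummy array, not the KITTI data, and not a proposed patch to kitti_train.py):

```python
import numpy as np

# Dummy batch with the shape reported in the error:
# (batch, time, rows, cols, channels), i.e. channels last.
batch_tf = np.zeros((4, 10, 128, 160, 3), dtype=np.float32)

# Move the channel axis to position 2 to get channels first:
# (batch, time, channels, rows, cols).
batch_th = np.transpose(batch_tf, (0, 1, 4, 2, 3))

print(batch_th.shape)
```

That the data only needs a permutation, not a resize, suggests the model and the data generator are reading different dimension-ordering settings.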

LanaSina commented 7 years ago

This fix https://github.com/coxlab/prednet/pull/5/files from @EderSantana worked, combined with not using cudnn (I think that is an issue in my configuration). Thank you!

LanaSina commented 7 years ago

Update: that fixed the kitti_evaluate bug, but not the shape bug from kitti_train. Can someone confirm?

LanaSina commented 7 years ago

There is a default setting in the keras.json file in the .keras directory. Mine had this as the default:

"image_dim_ordering": "tf",

I just changed it back to "th". Everything works now. In the end, @bill-lotter's first message was very close to the issue: it wasn't K.dim_ordering or the backend setting itself, but image_dim_ordering. I will mark the issue as closed.
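
For anyone who prefers to script the change described above rather than edit the file by hand, here is a rough sketch using only the standard library. The helper name set_image_dim_ordering is made up for this example and is not part of Keras. It is demonstrated on a throwaway file; to apply it for real you would pass the path to your own config, which for Keras 1.x is normally ~/.keras/keras.json (an assumption about your setup).

```python
import json
import os
import tempfile


def set_image_dim_ordering(config_path, ordering="th"):
    """Set image_dim_ordering in a Keras 1.x JSON config; return the old value.

    Hypothetical helper for illustration only, not a Keras function.
    """
    with open(config_path) as f:
        config = json.load(f)
    previous = config.get("image_dim_ordering")
    config["image_dim_ordering"] = ordering
    with open(config_path, "w") as f:
        json.dump(config, f, indent=4)
    return previous


# Demonstration on a temporary file so no real configuration is touched.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "keras.json")
with open(demo_path, "w") as f:
    json.dump({"backend": "theano", "image_dim_ordering": "tf"}, f)

previous = set_image_dim_ordering(demo_path)
with open(demo_path) as f:
    updated = json.load(f)
print(previous, "->", updated["image_dim_ordering"])
```

Keras reads this file at import time, so the change only takes effect in a fresh Python process.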

bill-lotter commented 7 years ago

Sorry about this. I tried to make it compatible with both dimension orderings (e.g. https://github.com/coxlab/prednet/blob/master/data_utils.py#L22), but I guess it didn't work. I'll work on making it compatible with both.