phunterlau / kaggle_statefarm

A simple baseline model set using MXNet for Kaggle StateFarm driver position identification

Incorrect shape error #1

Open opraveen opened 8 years ago

opraveen commented 8 years ago

I created the train and val .rec files, but when I run the training script with the Inception-BN model, I get this incorrect shape error:

$ ./run.cv_inception_bn.sh
2016-07-17 01:02:57,342 Node[0] start with arguments Namespace(batch_size=32, clip_gradient=5.0, data_dir='./', data_shape=224, dataset='ft', finetune_from='model/Inception_BN-0039', finetune_lr_scale=10, gpus='0', kv_store='local', load_epoch=None, log_dir='./tmp/', log_file=None, lr=0.001, lr_factor=1, lr_factor_epoch=1, model_prefix='./model/ckpt-shuffle1', network='inception-bn', num_classes=10, num_epochs=30, num_examples=216, train_dataset='sf1_train.rec', val_dataset='sf1_val.rec')
2016-07-17 01:02:57,342 Node[0] finetune from model/Inception_BN at epoch 39
[01:02:57] src/io/iter_image_recordio.cc:211: ImageRecordIOParser: ./sf1_train.rec, use 1 threads for decoding..
[01:02:57] src/io/./iter_normalize.h:218: Cannot find mean.bin: create mean image, this will take some time...
[01:03:11] src/io/./iter_normalize.h:231: 10000 images processed, 13.6055 sec elapsed
[01:03:21] src/io/./iter_normalize.h:231: 20000 images processed, 23.6031 sec elapsed
[01:03:21] src/io/./iter_normalize.h:244: Save mean image to mean.bin..
[01:03:22] src/io/iter_image_recordio.cc:211: ImageRecordIOParser: ./sf1_val.rec, use 1 threads for decoding..
[01:03:22] src/io/./iter_normalize.h:103: Load mean image from mean.bin
2016-07-17 01:03:24,226 Node[0] lr_scale: {'fc1_ft_weight': 10, 'softmax_label': 10, 'fc1_ft_bias': 10}
[01:03:24] ../mxnet/dmlc-core/include/dmlc/logging.h:235: [01:03:24] src/operator/./concat-inl.h:152: Check failed: (dshape[j]) == (tmp[j]) Incorrect shape[2]: (32,320,13,13). (first input shape: (32,576,14,14))
Traceback (most recent call last):
  File "train_inception_bn.py", line 92, in <module>
    train_model.fit(args, net, get_iterator)
  File "../kaggle_statefarm/inception/train_model.py", line 119, in fit
    epoch_end_callback = checkpoint)
  File "../mxnet/python/mxnet/model.py", line 746, in fit
    self._init_params(dict(data.provide_data+data.provide_label))
  File "../mxnet/python/mxnet/model.py", line 486, in _init_params
    argshapes, , aux_shapes = self.symbol.infer_shape(_input_shapes)
  File "../mxnet/python/mxnet/symbol.py", line 453, in infer_shape
    return self._infer_shape_impl(False, args, *_kwargs)
  File "../mxnet/python/mxnet/symbol.py", line 513, in _infer_shape_impl
    ctypes.byref(complete)))
  File "../mxnet/python/mxnet/base.py", line 77, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))

phunterlau commented 8 years ago

It seems you used an input shape other than 224, which Inception-BN requires. VGG and Inception-BN use different input shapes, as mentioned in the Kaggle forum posts, so please do not reuse the VGG input for the Inception-BN model.

lbin commented 8 years ago

This is the problem: mxnet #2585. Please check https://github.com/dmlc/mxnet/pull/2585

phunterlau commented 8 years ago

@lbin you are right. The pooling layer needs pad=(1, 1), like this:

pool = mx.symbol.Pooling(data=data, kernel=(3, 3), stride=(2, 2), pad=(1, 1), pool_type='max', attr=mirror_attr)

VGG is not affected by this.
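For anyone wondering why the padding matters, here is a rough sketch of the output-size arithmetic. It is not taken from the MXNet source. It assumes the floor-based formula floor((in + 2*pad - kernel) / stride) + 1 and a 28x28 feature map entering the stride-2 block, which is consistent with the 13 vs 14 mismatch in the error above. The helper name `out_size` is made up for illustration.

```python
def out_size(in_size, kernel, stride, pad=0):
    """Spatial output size of a conv/pool layer under the floor-based formula (assumed)."""
    return (in_size + 2 * pad - kernel) // stride + 1

# Assumed size of the feature map entering the stride-2 block.
feature_map = 28

# A 3x3, stride-2 branch with pad=(1, 1) gives 14.
padded_branch = out_size(feature_map, kernel=3, stride=2, pad=1)

# A 3x3, stride-2 pooling branch without padding gives 13,
# so Concat sees 13x13 next to 14x14 and fails.
pool_no_pad = out_size(feature_map, kernel=3, stride=2, pad=0)

# With pad=(1, 1) the pooling branch also gives 14 and the shapes agree.
pool_with_pad = out_size(feature_map, kernel=3, stride=2, pad=1)

print(padded_branch, pool_no_pad, pool_with_pad)
```

Under these assumptions the unpadded pooling branch comes out one pixel smaller than the padded branches, and adding pad=(1, 1) brings it back in line.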