lethienhoa / DenseNet-NLP

Tensorflow implementation of Very Deep Convolutional Networks for Natural Language Processing

Adding dimensions error: Dimensions must be equal, but are 64 and 128 for 'res_unit_1_0/add' (op: 'Add') with input shapes: [?,1,129,64], [?,1,129,128] #4

Open bao-dai opened 7 years ago

bao-dai commented 7 years ago

Hi, I'm just running train.py, but it raises an error when calling cnn = VDCNN():

Parameters:
ALLOW_SOFT_PLACEMENT=True
BATCH_SIZE=128
CHECKPOINT_EVERY=1000
DROPOUT_KEEP_PROB=0.5
EVALUATE_EVERY=5000
L2_REG_LAMBDA=0.0
LOG_DEVICE_PLACEMENT=False
NUM_EPOCHS=50

Loading data...
Non-neutral instances processed: 10000
Non-neutral instances processed: 20000
Non-neutral instances processed: 30000
Non-neutral instances processed: 40000
Non-neutral instances processed: 50000
Non-neutral instances processed: 60000
Non-neutral instances processed: 70000
Non-neutral instances processed: 80000
Non-neutral instances processed: 90000
Non-neutral instances processed: 100000
Non-neutral instances processed: 110000
Non-neutral instances processed: 120000
Non-neutral instances processed: 130000
Non-neutral instances processed: 140000
Non-neutral instances processed: 150000
Non-neutral instances processed: 160000
Non-neutral instances processed: 170000
Non-neutral instances processed: 180000
Non-neutral instances processed: 190000
Loading done
x_char_seq_ind=(194544,)
y shape=(194544, 2)
Train/Dev split: 0/194544
2017-11-20 17:48:22.932910: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 258, 64)
(?, 1, 258, 64)
(?, 1, 258, 64)

(?, 1, 129, 64)
(?, 1, 129, 128)
(?, 1, 129, 128)

Traceback (most recent call last):
  File "train.py", line 64, in <module>
    cnn = VDCNN()
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/model.py", line 59, in __init__
    h = resUnit(h, num_filters_per_size[i], cnn_filter_size, i, j)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/model.py", line 41, in resUnit
    output = input_layer + part6
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 894, in binary_op_wrapper
    return func(x, y, name=name)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 183, in add
    "Add", x=x, y=y, name=name)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2958, in create_op
    set_shapes_for_outputs(ret)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2209, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2159, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 627, in call_cpp_shape_fn
    require_shape_fn)
  File "/Users/dainguyen/python_workspace/Very-Deep-Convolutional-Networks-for-Natural-Language-Processing-master/vdcnn_venv/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 691, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Dimensions must be equal, but are 64 and 128 for 'res_unit_1_0/add' (op: 'Add') with input shapes: [?,1,129,64], [?,1,129,128].

Is there any way I can get rid of this? Is it a TensorFlow version issue or something?
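For readers trying to interpret the ValueError above: elementwise ops such as Add follow broadcasting rules, under which two dimensions are compatible only if they are equal or one of them is 1. The sketch below is a plain-Python illustration of that rule, not TensorFlow's actual shape function; the helper name broadcast_shape is hypothetical, and None stands in for the unknown batch dimension printed as '?'.

```python
def broadcast_shape(a, b):
    """Return the elementwise-op result shape for shapes a and b.

    None stands for an unknown dimension (shown as '?' in the log).
    Raises ValueError when a pair of known dimensions is incompatible.
    """
    # Left-pad the shorter shape with 1s so both shapes have equal rank.
    rank = max(len(a), len(b))
    a = (1,) * (rank - len(a)) + tuple(a)
    b = (1,) * (rank - len(b)) + tuple(b)

    result = []
    for dim_a, dim_b in zip(a, b):
        if dim_a is None or dim_b is None:
            # Unknown dimension: keep the other side if it pins the size.
            known = dim_b if dim_a is None else dim_a
            result.append(known if known not in (None, 1) else None)
        elif dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(
                "Dimensions must be equal, but are %d and %d" % (dim_a, dim_b)
            )
    return tuple(result)


# Shapes taken from the error message: the channel axis is 64 vs 128.
try:
    broadcast_shape((None, 1, 129, 64), (None, 1, 129, 128))
except ValueError as err:
    print(err)
```

Since 64 and 128 are neither equal nor 1, the addition cannot be broadcast, which suggests a shape mismatch in the model code rather than a TensorFlow version problem.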

jalbalah commented 6 years ago

yeah, same error. you can try returning just part6 (without input_layer), though that removes the shortcut, and then i'm wondering why highway_unit is commented out... i think the addition is meant to implement the shortcut, but adding two tensors elementwise is expected to fail when their channel dimensions differ (64 and 128). maybe you can add input_layer to only the first 64 channels (i.e. zero-pad the 64-channel tensor to match the shape), or uncomment the highway unit? i am not sure how this model implements shortcuts
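The zero-padding idea mentioned above can be sketched as follows. This is only an illustration with plain Python lists standing in for tensors: each feature map is a list of positions, each position a list of channel values, and the batch and height axes are omitted. The helper names zero_pad_channels and add_feature_maps are hypothetical and not part of the repository; in TensorFlow the padding step would more likely be done with tf.pad.

```python
def zero_pad_channels(feature_map, out_channels):
    """Append zeros to each position's channel vector up to out_channels."""
    padded = []
    for channels in feature_map:
        missing = out_channels - len(channels)
        if missing < 0:
            raise ValueError("zero padding cannot reduce the channel count")
        padded.append(list(channels) + [0.0] * missing)
    return padded


def add_feature_maps(a, b):
    """Elementwise sum of two feature maps with identical shapes."""
    if len(a) != len(b) or any(len(x) != len(y) for x, y in zip(a, b)):
        raise ValueError("shapes must match")
    return [[x + y for x, y in zip(ca, cb)] for ca, cb in zip(a, b)]


# Toy sizes: 2 positions, 2 -> 4 channels (stand-ins for 64 -> 128).
input_layer = [[1.0, 2.0], [3.0, 4.0]]
part6 = [[10.0, 10.0, 10.0, 10.0], [20.0, 20.0, 20.0, 20.0]]

output = add_feature_maps(zero_pad_channels(input_layer, 4), part6)
print(output)  # [[11.0, 12.0, 10.0, 10.0], [23.0, 24.0, 20.0, 20.0]]
```

This variant adds no trainable parameters: the input only contributes to the first channels of the sum, and the remaining channels pass through unchanged.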

leifanus commented 6 years ago

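The mismatch comes from num_filters_per_size changing between blocks (here from 64 to 128). My solution, in the function resUnit, is to project the input with a 1x1 convolution and add that projection instead of the raw input:

part0 = slim.conv2d(input_layer, num_filters_per_size_i, [1, 1], activation_fn=None)
output = part6 + part0

It should work this way.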
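To make the effect of the 1x1 convolution concrete, here is a minimal plain-Python sketch of the arithmetic, not the slim implementation. A 1x1 convolution applies the same linear map to the channel vector at every position, so it can change the channel count (64 to 128 in the issue, 2 to 4 in the toy sizes here) without touching the sequence length. The function name conv1x1 and the hand-picked weights are assumptions for illustration; in the real model the weights are learned, so the shortcut is a projection rather than a strict identity map.

```python
def conv1x1(feature_map, weights, bias):
    """Apply one linear map (weights: in x out, plus bias) at every position."""
    out_channels = len(bias)
    projected = []
    for channels in feature_map:
        if len(channels) != len(weights):
            raise ValueError("weights must have one row per input channel")
        projected.append([
            sum(x * weights[k][c] for k, x in enumerate(channels)) + bias[c]
            for c in range(out_channels)
        ])
    return projected


# Toy sizes: 2 positions, 2 input channels projected to 4 output channels.
input_layer = [[1.0, 2.0], [3.0, 4.0]]
weights = [[1.0, 0.0, 1.0, 0.0],   # row 0: contribution of input channel 0
           [0.0, 1.0, 0.0, 1.0]]   # row 1: contribution of input channel 1
bias = [0.0, 0.0, 0.0, 0.0]

part0 = conv1x1(input_layer, weights, bias)   # now 4 channels per position
part6 = [[10.0] * 4, [20.0] * 4]              # stand-in for the conv branch

# With matching channel counts, the residual addition is well defined.
output = [[p + q for p, q in zip(r0, r6)] for r0, r6 in zip(part0, part6)]
print(output)  # [[11.0, 12.0, 11.0, 12.0], [23.0, 24.0, 23.0, 24.0]]
```

Compared with dropping the shortcut entirely, this keeps a path from the input to the output of the unit, at the cost of a small number of extra parameters per block where the filter count changes.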

jalbalah commented 6 years ago

thank you for following up! 👍 :) i guess part0 is an identity map (since convolution is 1x1... [1,1]) that just reshapes to 128?

i tried to confirm the identity map, and it looks ok. slim (tf.contrib.slim) is a Python library on top of tensorflow. adding seems appropriate, and the resnet shortcut seems correct with respect to the reference and original papers, respectively:
page 2 figure 2 - https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf
page 5 figure 2 - https://arxiv.org/pdf/1606.01781.pdf

notes: where it failed before (after going through filter0, starting filter1):
input_layer = <tf.Tensor 'pool_0/MaxPool:0' shape=(?, 1, 15, 64) dtype=float32>
part0 = <tf.Tensor 'res_unit_1_0/Conv/BiasAdd:0' shape=(?, 1, 15, 128) dtype=float32>
part6 = <tf.Tensor 'res_unit_1_0/Conv_2/BiasAdd:0' shape=(?, 1, 15, 128) dtype=float32>

input_layer + part6 is the original code and throws the error
part0 + part6 is the new code and works

leifanus commented 6 years ago

Yes, exactly!