titu1994 / keras-coordconv

Keras implementation of CoordConv for all Convolution layers
MIT License

3D CoordConv on GPU #8

socathie closed this issue 5 years ago

socathie commented 5 years ago

I get the following error when running on GPU (using Theano) with a 3D single-channel input. The same model compiles fine on my personal laptop (using TensorFlow).

Traceback (most recent call last):
  File "vgg_coord.py", line 52, in <module>
    model = VoxCoordCNN(x_vol, y_vol, z_vol)
  File "vgg_coord.py", line 22, in VoxCoordCNN
    out = CoordinateChannel3D()(X)
  File "/share1/anaconda3/envs/tensorflow-gpu/lib/python3.5/site-packages/keras/engine/topology.py", line 619, in __call__
    output = self.call(inputs, **kwargs)
  File "/data/cathie/cnn/coord.py", line 143, in call
    xx_ones = K.ones(K.stack([batch_shape, dim3]), dtype='int32')
  File "/share1/anaconda3/envs/tensorflow-gpu/lib/python3.5/site-packages/keras/backend/theano_backend.py", line 310, in ones
    return variable(np.ones(shape), dtype, name)
  File "/share1/anaconda3/envs/tensorflow-gpu/lib/python3.5/site-packages/numpy/core/numeric.py", line 188, in ones
    a = empty(shape, dtype, order)
TypeError: expected sequence object with len >= 0 or a single integer

It seems to be a problem with the Theano backend.

titu1994 commented 5 years ago

  File "/share1/anaconda3/envs/tensorflow-gpu/lib/python3.5/site-packages/keras/backend/theano_backend.py", line 310, in ones
    return variable(np.ones(shape), dtype, name)

Oh Theano. This occurs because the Theano backend's K.ones does not accept a tensor as its shape argument. As the line above shows, it builds the array with NumPy first (np.ones(shape)) and then wraps that ndarray in a Theano variable.

Since the batch size is unknown at model-building time (it is generally set to None), there is no way to pass the shape [batch_size, dim3] to NumPy dynamically.

To work on Theano, this needs either a fix in the Keras backend, or a rewrite of the code plus a static batch size in the input shape.

I would suggest working with TF or CNTK for now.
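
The failure described above can be reproduced without Theano. The following is a minimal sketch using only NumPy; `SymbolicDim` and `ones_via_numpy` are hypothetical names introduced for illustration and are not part of Keras or Theano. It assumes only what the traceback shows, namely that the shape is handed straight to `np.ones`.

```python
import numpy as np


class SymbolicDim:
    """Hypothetical stand-in for a batch size that is unknown at build time."""

    def __repr__(self):
        return "SymbolicDim(?)"


def ones_via_numpy(shape):
    # Mirrors the call in the traceback: NumPy builds the array eagerly,
    # so every entry of `shape` must already be a concrete integer.
    return np.ones(shape)


# A fully static shape works.
static = ones_via_numpy((2, 4))
print(static.shape)

# A shape with an unknown batch dimension cannot be turned into an ndarray.
try:
    ones_via_numpy([SymbolicDim(), 4])
    failed = False
except TypeError as exc:
    failed = True
    print("TypeError:", exc)
```

NumPy has no notion of a deferred dimension, which is why a backend that delegates to it needs every size known up front.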

socathie commented 5 years ago

I tried staying with Theano and using Input(batch_shape=(batch_size, x, y, z, 1)) instead. It still didn't work. I guess I'll have to switch to TF.

titu1994 commented 5 years ago

It would require a rewrite of the script as well: removing all of the K.stack() calls and passing the shapes directly as plain arrays of integers.
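
As a rough illustration of what passing static shapes directly could look like, here is a NumPy-only sketch that builds one coordinate channel for a 3D input from plain integers. The function name `xx_channel`, the channels-last layout, and the scaling to [-1, 1] are assumptions made for this example; it is not the code in the repository.

```python
import numpy as np


def xx_channel(batch_size, dim1, dim2, dim3):
    """Coordinate channel along the first spatial axis.

    Returns an array of shape (batch_size, dim1, dim2, dim3, 1).
    All arguments are plain Python ints, so no stacked shape tensor is needed.
    """
    coords = np.arange(dim1, dtype="float32")
    if dim1 > 1:
        # Assumed normalization: map indices 0..dim1-1 onto [-1, 1].
        coords = coords / (dim1 - 1) * 2.0 - 1.0
    # Broadcast the 1D ramp over the batch and the two remaining spatial axes.
    ramp = coords[None, :, None, None, None]
    channel = np.broadcast_to(ramp, (batch_size, dim1, dim2, dim3, 1))
    return np.ascontiguousarray(channel)


ch = xx_channel(batch_size=2, dim1=4, dim2=3, dim3=5)
print(ch.shape)
```

Because every size is a concrete integer, the resulting array could be wrapped as a constant by a backend that builds tensors through NumPy. The cost is that the batch size is fixed when the model is built.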

titu1994 commented 5 years ago

I just added a Theano-specific version of the script, called coord_theano.py. I think it may work when combined with a static batch shape.