tensorflow / tensorflow

Iterate on Unknown Batch Size with Custom Layer #31991

Closed jmd-0 closed 5 years ago

jmd-0 commented 5 years ago

System information

Describe the current behavior

I am attempting to build a custom TensorFlow layer that performs K-Means clustering across the channels of a given image. I am having difficulty creating this layer because, fundamentally, I don't seem to be able to iterate over the batch dimension, whose size is unknown until runtime. I have tried a few alternatives, such as the @tf.function decorator and tf.scan, both without success.

Describe the expected behavior

Since the batch size is unknown until runtime, I expected TensorFlow to handle this case, similar to how it can accept an unknown dimension and generate a matrix/tensor with the unknown shape.

Code to reproduce the issue

import numpy as np
import tensorflow as tf
from sklearn.cluster import KMeans

class KMeansLayer(tf.keras.layers.Layer):
    def __init__(self, num_clusters=8, n_init=5, trainable=False):
        super(KMeansLayer, self).__init__()
        self.clusters = num_clusters
        self.n_init = n_init
        self.trainable = trainable

    def build(self, input_shape):
        print('Input shape:', input_shape)
        self.output_s = (input_shape[0], input_shape[1], input_shape[2], 1)
        self.built = True

    def call(self, input):

        @tf.function
        def KMeansBase(input_mat, clusters, n_init):
            base_mat = tf.zeros((input_mat.shape[0], input_mat.shape[1] * input_mat.shape[2]))
            for frame in range(input_mat.shape[0]):
                init_mat = np.zeros((input_mat.shape[1] * input_mat.shape[2]))
                reshape_mat = tf.reshape(input_mat[frame], shape=(input_mat.shape[1] * input_mat.shape[2], input_mat.shape[3]))
                kmeans_init = KMeans(n_clusters=clusters, n_init=n_init)
                class_pred = kmeans_init.fit_predict(reshape_mat.numpy())

                for clust in range(clusters):
                    init_mat[class_pred == clust] = tf.keras.backend.mean(tf.boolean_mask(reshape_mat, class_pred == clust), axis=1).numpy()
                    init_mat[class_pred == clust] = np.mean(init_mat[class_pred == clust], axis=None)
                base_mat = tf.compat.v1.scatter_update(base_mat, frame, tf.convert_to_tensor(init_mat))
            base_mat = tf.reshape(base_mat, (input_mat.shape[0], input_mat.shape[1], input_mat.shape[2]))

            return tf.expand_dims(base_mat, axis=-1)

        return KMeansBase(input, clusters=self.clusters, n_init=self.n_init)

input_1 = tf.keras.Input(shape=(28, 28, 1), name='input_1', dtype='float32')
conv_1 = tf.keras.layers.Conv2D(filters=3, kernel_size=3, strides=1, padding='same', data_format='channels_last', activation='elu', kernel_initializer='glorot_uniform')(input_1)
kmeans_out = KMeansLayer(num_clusters=8, n_init=5)(conv_1)

model = tf.keras.Model(inputs=[input_1], outputs=kmeans_out)
tf.keras.utils.plot_model(model, show_shapes=True)
model.compile(optimizer='adam', loss='mse', metrics=['mse'])

The error that I get from running the above code is as follows:

Traceback (most recent call last):
  File "example_error_file.py", line 44, in <module>
    kmeans_out = KMeansLayer(num_clusters=8, n_init=5)(conv_1)
  File "~/fluoro/fenv/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 634, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "~/fluoro/fenv/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 149, in wrapper
    raise e.ag_error_metadata.to_exception(type(e))
TypeError: in converted code:

    example_error_file.py:38 call *
        return KMeansBase(input, clusters=self.clusters, n_init=self.n_init)
    ~/fluoro/fenv/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py:414 __call__
        self._initialize(args, kwds, add_initializers_to=initializer_map)
    /var/folders/6p/0r05_kf55273nh_nftm5q9tw0000gn/T/tmp0zjq0l_w.py:14 KMeansBase *
        base_mat = ag__.converted_call('zeros', tf, ag__.ConversionOptions(recursive=True, force_conversion=False, optional_features=(), internal_convert_user_code=True), ((input_mat.shape[0], input_mat.shape[1] * input_mat.shape[2]),), None)
    ~/fluoro/fenv/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py:1880 zeros
        shape = ops.convert_to_tensor(shape, dtype=dtypes.int32)
    ~/fluoro/fenv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py:1087 convert_to_tensor
        return convert_to_tensor_v2(value, dtype, preferred_dtype, name)
    ~/fluoro/fenv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py:1145 convert_to_tensor_v2
        as_ref=False)
    ~/fluoro/fenv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py:1224 internal_convert_to_tensor
        ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
    ~/fluoro/fenv/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py:305 _constant_tensor_conversion_function
        return constant(v, dtype=dtype, name=name)
    ~/fluoro/fenv/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py:246 constant
        allow_broadcast=True)
    ~/fluoro/fenv/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py:284 _constant_impl
        allow_broadcast=allow_broadcast))
    ~/fluoro/fenv/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py:467 make_tensor_proto
        nparray = np.array(values, dtype=np_dt)

    TypeError: __int__ returned non-int (type NoneType)

My main question is: can TensorFlow not handle iterating over an unknown batch size, or am I missing some functionality?

Thank you for the help!

jmd-0 commented 5 years ago

Hi @gowthamkpr, any advice? Thank you for your help!

pandrey-fr commented 5 years ago

Hi, I tested in 2.0-rc0 (with Eager disabled) instead of 1.14 but intuitively the results should be similar.

In TF2, one issue comes from using input_mat.shape[0], which returns None, where you should probably use tf.shape(input_mat)[0], which returns a dynamic scalar tensor holding the input's actual batch size. This might not be the case in 1.14, but at any rate, to answer your question: in general, it is possible to handle a dynamic batch size, typically with a tf.while_loop (in your case, AutoGraph generates one from your code).
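The difference between the static and the dynamic shape can be sketched as follows. This is a minimal illustration assuming TF 2.x; the function name per_frame_mean and the per-frame computation (a channel mean) are hypothetical stand-ins, not the original K-Means logic.

```python
import tensorflow as tf

@tf.function(input_signature=[tf.TensorSpec([None, 28, 28, 3], tf.float32)])
def per_frame_mean(input_mat):
    # Static shape: input_mat.shape[0] is None here, so it cannot size a tensor.
    # Dynamic shape: a scalar int32 tensor resolved when the function runs.
    batch = tf.shape(input_mat)[0]
    # The spatial and channel dimensions are known statically (plain Python ints).
    n_pixels = input_mat.shape[1] * input_mat.shape[2]
    channels = input_mat.shape[3]

    # A TensorArray collects one result per frame.
    results = tf.TensorArray(tf.float32, size=batch)
    # Iterating over tf.range lets AutoGraph turn this loop into a tf.while_loop.
    for frame in tf.range(batch):
        flat = tf.reshape(input_mat[frame], (n_pixels, channels))
        results = results.write(frame, tf.reduce_mean(flat, axis=1))
    return results.stack()  # shape: (batch, n_pixels)

out = per_frame_mean(tf.ones((5, 28, 28, 3)))
print(out.shape)
```

The same traced function accepts any batch size, because the loop bound is read from the tensor at run time rather than from the static shape.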

That being said, I think in your case the core reason for the failure is that you are attempting to feed a (symbolic) tensor to a scikit-learn object that expects NumPy arrays. This would perhaps work with Eager enabled, on EagerTensors (which are implicitly converted back and forth to NumPy arrays), but not (I think) on symbolic ones.
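One possible bridge, which is not proposed in this thread, is tf.numpy_function: it runs a plain Python function on concrete NumPy arrays when the graph executes. The sketch below assumes TF 2.x and uses a NumPy-only stand-in (quantize_batch_np, a hypothetical name) where a scikit-learn call could go. No gradients flow through the wrapped function, so this is only suitable for non-trainable steps.

```python
import numpy as np
import tensorflow as tf

def quantize_batch_np(batch):
    # Plain NumPy code: `batch` is a concrete array here, so its batch size
    # is known and an ordinary Python loop works.
    out = np.empty(batch.shape[:3] + (1,), dtype=np.float32)
    for i in range(batch.shape[0]):
        # Stand-in for a per-frame NumPy/scikit-learn computation.
        out[i, ..., 0] = batch[i].mean(axis=-1)
    return out

@tf.function(input_signature=[tf.TensorSpec([None, 28, 28, 3], tf.float32)])
def quantize_batch(batch):
    result = tf.numpy_function(quantize_batch_np, [batch], tf.float32)
    # numpy_function discards static shape information; restore what is known.
    result.set_shape([None, 28, 28, 1])
    return result

print(quantize_batch(tf.ones((4, 28, 28, 3))).shape)
```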

In my humble opinion, you should probably implement (or find) a TensorFlow version of the KMeans algorithm and use it (or, possibly, find an alternative clustering layer that is easier to write and might have some learnable weights). At any rate, I hope that this helps a bit, and that you will find a suitable solution soon!
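As a rough idea of what a TensorFlow-only version could look like, here is a minimal sketch of Lloyd's algorithm written with batched ops, so no Python iteration over the batch is needed. It is an illustration under stated assumptions, not an existing TensorFlow API: the function name batched_kmeans, the fixed iteration count, and the naive initialisation (the first num_clusters points of each sample, which requires at least that many points) are all hypothetical choices.

```python
import tensorflow as tf

def _assign(points, centroids):
    # Squared distance from every point to every centroid: (batch, n, k).
    diff = points[:, :, tf.newaxis, :] - centroids[:, tf.newaxis, :, :]
    return tf.argmin(tf.reduce_sum(tf.square(diff), axis=-1), axis=-1)

def batched_kmeans(points, num_clusters=8, num_iters=10):
    """points: float tensor of shape (batch, n_points, dim)."""
    # Naive initialisation (assumption): first `num_clusters` points per sample.
    centroids = points[:, :num_clusters, :]
    for _ in range(num_iters):
        labels = _assign(points, centroids)
        one_hot = tf.one_hot(labels, num_clusters, dtype=points.dtype)
        counts = tf.reduce_sum(one_hot, axis=1)                # (batch, k)
        sums = tf.einsum('bnk,bnd->bkd', one_hot, points)      # (batch, k, dim)
        means = sums / tf.maximum(counts, 1.0)[..., tf.newaxis]
        # Keep the previous centroid when a cluster received no points.
        centroids = tf.where(counts[..., tf.newaxis] > 0, means, centroids)
    return _assign(points, centroids), centroids

points = tf.constant([[[0., 0.], [10., 10.], [0., 1.], [10., 11.]]])
labels, centroids = batched_kmeans(points, num_clusters=2, num_iters=5)
print(labels.numpy(), centroids.numpy())
```

For the image case in the question, the input could first be reshaped to (batch, height * width, channels) using tf.shape for the batch dimension. The argmin assignment is not differentiable, so a layer built this way would have no useful gradient with respect to the labels.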

gowthamkpr commented 5 years ago

@jmd-0 Did @pandrey-fr's answer solve your problem? Can I close the issue?

jmd-0 commented 5 years ago

Hi @pandrey-fr, thank you for your response! I tried to experiment with the different TF functions you suggested and was unable to do exactly what I wanted, so it seems that I should just try to use the TF implementation of the KMeans algorithm, as you suggested.

Also, thank you for the interesting tidbit about how TF generates the symbolic tensors for unknown dataset sizes. I need to be aware of that in the future.

Thanks again, @pandrey-fr

pandrey-fr commented 5 years ago

You are very welcome; good luck with the follow-up work :-)

menon92 commented 5 years ago

I'm facing the same problem. Has anyone solved this kind of problem?

Andreasksalk commented 4 years ago

Any updates on this? I need a custom layer that does a for loop.

menon92 commented 4 years ago

@Andreasksalk

A layer does not support iteration. You have to perform the operation on the entire batch.
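As an illustration of operating on the entire batch, a per-sample computation can often be expressed as a reduction over the non-batch axes, so the unknown batch size never has to be read. The layer below is a hypothetical example (per-image standardisation), not something from this thread.

```python
import tensorflow as tf

class PerImageStandardize(tf.keras.layers.Layer):
    """Hypothetical layer: standardises each image independently."""

    def call(self, inputs):
        # Reduce over height, width and channels; keep the batch axis intact.
        mean = tf.reduce_mean(inputs, axis=[1, 2, 3], keepdims=True)
        std = tf.math.reduce_std(inputs, axis=[1, 2, 3], keepdims=True)
        # Small constant avoids division by zero for constant images.
        return (inputs - mean) / (std + 1e-6)

layer = PerImageStandardize()
print(layer(tf.random.uniform((4, 8, 8, 3))).shape)
```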

Andreasksalk commented 4 years ago

So is it a complete no-go to use packages other than Keras? I have to perform optimal transport on a set of tensors in a Keras model. I have found a way to do this on a single input, but cannot seem to find a way to implement it in the Keras model.

pandrey-fr commented 4 years ago

@Andreasksalk It depends on what you want to perform, and more specifically on when you want it to happen. The overall issue is very simple: in order to train your model through backpropagation, TensorFlow needs to be able to track what happens to the data, and hence to compute the gradients of the loss function relative to the model's trainable weights. But if you want to operate on your input data (before any trainable weights are used), you should be able to do it, probably using a Lambda layer and setting it to run eagerly.

ashwanikumar04 commented 3 years ago

I am trying to implement a custom Keras layer that applies a random shear. I am trying to use tf.keras.preprocessing.image.random_shear, which shears one image at a time, so I have to iterate over the tensor and call this method for each input. However, the input has shape (None, 32, 32, 3), so I can't know the number of rows. I tried to use tf.shape(inputs)[0], but it did not help.

Is there any other way to do this?

pandrey-fr commented 3 years ago

Hello @ashwanikumar04, I think you are looking for tf.map_fn (sheared = tf.map_fn(tf.keras.preprocessing.image.random_shear, inputs))
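A minimal sketch of the tf.map_fn pattern, assuming TF 2.x and a per-image function written with TensorFlow ops. The function random_shift is a hypothetical stand-in for the shear: as far as I know, tf.keras.preprocessing.image.random_shear works on NumPy arrays, so passing it to tf.map_fn directly may fail on symbolic tensors unless it is wrapped (for example with tf.numpy_function).

```python
import tensorflow as tf

def random_shift(image):
    # Per-image transform built from TF ops: roll the image horizontally
    # by a random number of pixels drawn independently for each image.
    shift = tf.random.uniform([], minval=-4, maxval=5, dtype=tf.int32)
    return tf.roll(image, shift=shift, axis=1)

@tf.function(input_signature=[tf.TensorSpec([None, 32, 32, 3], tf.float32)])
def augment(inputs):
    # map_fn applies the function to each element along the first (batch)
    # axis, so the batch size does not need to be known when tracing.
    return tf.map_fn(random_shift, inputs)

print(augment(tf.random.uniform((6, 32, 32, 3))).shape)
```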