keras-team / keras-applications

Reference implementations of popular deep learning models.

Add support for `include_top=False` in ResNeXt #85

Closed taehoonlee closed 4 years ago

taehoonlee commented 5 years ago

Currently, it is not possible to call `ResNeXt50(include_top=False)`, because the reshape operation inside ResNeXt requires static input shapes. This PR replaces the reshape and the `Lambda` with a `conv2d` call in order to enable that case.
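
For context, here is a minimal sketch of the idea (not the exact diff). A `Reshape`-based group sum needs static spatial dimensions, whereas a 1x1 `conv2d` with a fixed 0/1 kernel computes the same per-group channel sums while leaving the spatial dimensions fully dynamic. The values of `filters`, `groups`, and `c` below are illustrative:

import numpy as np
from keras import backend as K

filters = 128
groups = 32
c = filters // groups  # channels per group

# Fixed 0/1 kernel: output channel i sums the c channels of its group.
kernel = np.zeros((1, 1, filters * c, filters), dtype='float32')
for i in range(filters):
    start = (i // c) * c * c + i % c
    kernel[:, :, start:start + c * c:c, i] = 1.

def group_sum(x):
    # conv2d never needs the spatial shape statically, so inputs of
    # shape (None, None, None, filters * c) are handled.
    return K.conv2d(x, K.variable(kernel))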

The test code is:

import numpy as np

from keras import backend as K
from keras.preprocessing import image
from keras.applications.resnext import ResNeXt50
from keras.applications.resnext import preprocess_input, decode_predictions

img = image.load_img('cat.png', target_size=(256, 256))
x = image.img_to_array(img)[16:-16, 16:-16]
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

model = ResNeXt50(weights='imagenet', include_top=True)
print(decode_predictions(model.predict(x), top=3)[0])

# Before this PR, both of the following calls raise this error:
# ValueError: Tried to convert 'shape' to a tensor and failed. Error: None values not supported.
model2 = ResNeXt50(weights='imagenet', include_top=False, input_shape=(None, None, 3))
model3 = ResNeXt50(weights='imagenet', include_top=False, input_shape=None)

outs1 = K.function([model.layers[0].input], [model.layers[-3].output])([x])[0]
outs2 = model2.predict(x)
outs3 = model3.predict(x)

np.testing.assert_allclose(outs1, outs2)
np.testing.assert_allclose(outs2, outs3)

# After this PR, arbitrary input shapes are OK.
print(model2.predict(np.random.random((1, 224, 224, 3))).shape)
print(model2.predict(np.random.random((1, 400, 400, 3))).shape)
print(model3.predict(np.random.random((1, 500, 500, 3))).shape)

The results are:

[('n02112018', 'Pomeranian', 0.13667941), ('n02123394', 'Persian_cat', 0.06406643), ('n02124075', 'Egyptian_cat', 0.05045045)]
(1, 7, 7, 2048)
(1, 13, 13, 2048)
(1, 16, 16, 2048)
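
(As expected for a network with a total downsampling stride of 32, the spatial output size is ceil(input_size / 32): 224/32 = 7, 400/32 = 12.5 -> 13, and 500/32 = 15.625 -> 16.)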
taehoonlee commented 5 years ago

@fchollet, have you made any progress on handling the ResNeXt breaking changes recently discussed in the emails? This PR enables `ResNeXt50(include_top=False)`.

d02k01 commented 5 years ago

@taehoonlee, I have converted the pre-trained weights according to this review.

The scripts are here.

The converted weights are 5 times larger than the current files, so I think using Split (not implemented yet) and Add like this is a better way.

d02k01 commented 5 years ago

I have confirmed that the `Conv2D` approach and this method produce the same results.

import keras
import numpy as np
import tensorflow as tf

if __name__ == '__main__':
    np.random.seed(42)

    filters = 128  # total output channels
    groups = 32  # cardinality of the grouped convolution
    c = filters // groups  # channels per group
    input_shape = (None, None, filters * c)  # dynamic spatial dimensions

    # using `Conv2D`: a fixed 0/1 kernel so that output channel i
    # sums the c channels belonging to its group
    inputs = keras.layers.Input(input_shape)
    kernel = np.zeros((1, 1, filters * c, filters), dtype='f')
    for i in range(filters):
        start = (i // c) * c * c + i % c
        end = start + c * c
        kernel[:, :, start:end:c, i] = 1
    x = keras.layers.Lambda(
        lambda x: keras.backend.conv2d(x, keras.backend.variable(kernel))
    )(inputs)
    model0 = keras.Model(inputs, x)

    # using `tf.split`, `tf.expand_dims` and `tf.unstack`
    inputs = keras.layers.Input(input_shape)
    x = keras.layers.Lambda(lambda i: tf.split(i, filters, axis=-1))(inputs)
    x = [
        keras.layers.Lambda(lambda i: tf.expand_dims(i, axis=-1))(i) for i in x]

    x_list = []
    for i in range(groups):
        start = i * c
        end = (i + 1) * c

        tmp = keras.layers.concatenate(x[start:end], axis=-1)
        tmp = keras.layers.Lambda(lambda x: tf.unstack(x, axis=-1))(tmp)
        tmp = keras.layers.add(tmp)
        x_list.append(tmp)

    x = keras.layers.concatenate(x_list, axis=-1)
    model1 = keras.Model(inputs, x)

    # verify
    x = np.random.rand(1, 224, 224, input_shape[-1]).astype('f')
    out0 = model0.predict(x)
    out1 = model1.predict(x)
    assert np.array_equal(out0, out1)

I think this flow, split -> (expand_dims ->) concat -> unstack -> add -> concat, is the best way. If you (@taehoonlee, @fchollet) agree, I will implement a new Keras layer.
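
For illustration only, here is a rough sketch of what such a layer might look like; the name `GroupSum` and all details below are hypothetical and are not part of this PR:

import tensorflow as tf
from keras.layers import Layer

class GroupSum(Layer):
    """Hypothetical layer: sums the c copies inside each channel group,
    mapping (..., groups * c * c) channels to (..., groups * c)."""

    def __init__(self, groups, c, **kwargs):
        super(GroupSum, self).__init__(**kwargs)
        self.groups = groups
        self.c = c

    def call(self, inputs):
        outputs = []
        for g in range(self.groups):
            # Channels of group g, viewed as c copies of c channels each.
            block = inputs[..., g * self.c * self.c:(g + 1) * self.c * self.c]
            outputs.append(tf.add_n(tf.split(block, self.c, axis=-1)))
        return tf.concat(outputs, axis=-1)

    def compute_output_shape(self, input_shape):
        return input_shape[:-1] + (self.groups * self.c,)

    def get_config(self):
        config = super(GroupSum, self).get_config()
        config.update({'groups': self.groups, 'c': self.c})
        return config

Such a layer would replace the whole split/concat/unstack/add/concat pipeline with a single call.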

taehoonlee commented 4 years ago

@ocjosen, sorry for the late response. There were three options in this discussion.

  1. Using `Lambda` with a NumPy array, as originally proposed in this PR: @fchollet is concerned about the poor serializability of `Lambda`.
  2. Using `Conv2D` and updating the weight files, as proposed by @fchollet: @ocjosen helped with the update, but it resulted in weight files that are too heavy (5 times larger than now).
  3. Adding a new layer with split, ..., concat, as proposed by @ocjosen: it would work, but it is not ideal in my humble opinion. If we add a custom layer to resnet_common.py, it is as hard to serialize as a Lambda layer. And if we add a new layer to the master branch of Keras, our ResNeXt will always require the latest Keras. I think keras-applications should provide the widest compatibility possible.

Thus, I proposed a fourth option: using `Conv2D` without updating the weight files, as sketched below. It will be serializable because it is composed of a regular layer and a regular initializer. What do you think now, @fchollet? I look forward to hearing from you.
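
A minimal sketch of that option, assuming the built-in `Constant` initializer accepts the full kernel array (the variable names here are illustrative, not the PR's code):

import numpy as np
from keras import layers, initializers

filters = 128
groups = 32
c = filters // groups

# The same fixed 0/1 group-sum kernel as in the sketches above.
kernel = np.zeros((1, 1, filters * c, filters), dtype='float32')
for i in range(filters):
    start = (i // c) * c * c + i % c
    kernel[:, :, start:start + c * c:c, i] = 1.

# A regular layer with a regular initializer, hence serializable; the
# layer is frozen (trainable=False) and bias-free, so no new trainable
# weights are introduced.
group_sum = layers.Conv2D(filters, 1, use_bias=False, trainable=False,
                          kernel_initializer=initializers.Constant(kernel))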

taehoonlee commented 4 years ago

@ocjosen, I would really appreciate your help, and I'm really sorry again for my late response.