NASNet misleading ValueError when include_top=False and weights

Waffleboy commented 5 years ago

Setting a different image size with include_top = False and weights = 'imagenet' as below,

base_model_transfer = NASNetLarge(input_shape=(image_shape[0],image_shape[1],3), include_top=False,\
                         weights='imagenet', input_tensor=None, \
                         pooling='max', classes=train_generator.num_classes)

I obtain a ValueError: When setting `include_top=True` and loading `imagenet` weights, `input_shape` should be (331, 331, 3).

Seems strange when include_top is set to False. Should this not work?

taehoonlee commented 5 years ago

@Waffleboy, the architecture of NASNet differs according to input_shape. Specifically, line 513 (elif p_shape[img_dim] != ip_shape[img_dim]) causes structural changes, because the zero-padding in the 2x down-sampling layers (conv, pool) behaves differently depending on whether the input size is even or odd.

Thus, NASNet(weights='imagenet') is now forced to take only input_shape=(331, 331, 3) to restore the pretrained weights that depend on the architecture.
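
To make the odd/even dependence concrete, here is a rough sketch of how the spatial size shrinks through the network. It is a simplification and not the library code: it assumes one 3x3 stride-2 stem convolution without padding, followed by four stride-2 reductions that each produce ceil(size / 2). The function name nasnet_spatial_sizes is made up for illustration.

```python
import math

def nasnet_spatial_sizes(n, num_reductions=4):
    """Approximate spatial sizes after each down-sampling step for an n x n input.

    Assumption (not taken from the library source): a 3x3 stride-2 stem conv
    without padding, then `num_reductions` stride-2 steps yielding ceil(size / 2).
    """
    size = (n - 3) // 2 + 1            # 3x3 conv, stride 2, no padding
    sizes = [size]
    for _ in range(num_reductions):
        size = math.ceil(size / 2)     # stride-2 step with 'same'-style padding
        sizes.append(size)
    return sizes

for n in (331, 224, 50):
    print(n, nasnet_spatial_sizes(n))
```

Under these assumptions a 331 input ends at 11, which agrees with the [?,11,11,4032] shape in the traceback further down this thread. Other input sizes go through a different sequence of odd and even intermediate sizes, which is where the padding branches can diverge.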

Waffleboy commented 5 years ago

@taehoonlee Ah okay. If I want to train NASNet from scratch, and also via transfer learning, on a 7-class dataset, how would you recommend I do so?

taehoonlee commented 5 years ago

@Waffleboy,

Training from scratch: keras.applications.NASNetMobile(weights=None, input_shape=(128, 128, 3), classes=7),
Transfer learning: the section "Fine-tune InceptionV3 on a new set of classes" in the official docs.

Waffleboy commented 5 years ago

@taehoonlee thanks for your reply!

I can't seem to get it to download the model with ImageNet weights. The link https://github.com/titu1994/Keras-NASNet/'%20'releases/download/v1.2/NASNet-large.h5 does not work, and even in the code it loads the no-top version by default instead. Is there something I'm doing wrong?

taehoonlee commented 5 years ago

@Waffleboy, could you share your code?

Waffleboy commented 5 years ago

Sure thing @taehoonlee ,

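from keras.applications.nasnet import NASNetLarge
from keras.preprocessing.image import ImageDataGenerator

image_shape = (331,331)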
train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        vertical_flip = True,
        data_format='channels_last')

test_datagen = ImageDataGenerator(rescale=1./255,data_format='channels_last')

experiment_folder = base_folder+experiment_name

train_generator = train_datagen.flow_from_directory(
        '{}/train'.format(experiment_folder),
        target_size=image_shape,
        batch_size=batch_size,
        class_mode='categorical')

base_model = NASNetLarge(input_shape=(image_shape[0],image_shape[1],3), include_top=True,
                         input_tensor=None, weights=None, classes=train_generator.num_classes)

Also, the transfer learning bit doesn't quite work. It fails at compile time with:

First training
Traceback (most recent call last):
  File "/home/s1885554/miniconda3/envs/mlp_tf/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1576, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape must be rank 2 but is rank 4 for 'metrics/top_3_accuracy/in_top_k/InTopKV2' (op: 'InTopKV2') with input shapes: [?,11,11,4032], [?,?,?], [].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "project.py", line 231, in <module>
    optimizer='adam',metrics=['categorical_accuracy',top_3_accuracy])
  File "/home/s1885554/miniconda3/envs/mlp_tf/lib/python3.5/site-packages/keras/engine/training.py", line 440, in compile
    handle_metrics(output_metrics)
  File "/home/s1885554/miniconda3/envs/mlp_tf/lib/python3.5/site-packages/keras/engine/training.py", line 409, in handle_metrics
    mask=masks[i])
  File "/home/s1885554/miniconda3/envs/mlp_tf/lib/python3.5/site-packages/keras/engine/training_utils.py", line 403, in weighted
    score_array = fn(y_true, y_pred)
  File "project.py", line 178, in top_3_accuracy
    return top_k_categorical_accuracy(y_true, y_pred, k=3)
  File "/home/s1885554/miniconda3/envs/mlp_tf/lib/python3.5/site-packages/keras/metrics.py", line 43, in top_k_categorical_accuracy
    return K.mean(K.in_top_k(y_pred, K.argmax(y_true, axis=-1), k), axis=-1)
  File "/home/s1885554/miniconda3/envs/mlp_tf/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 3397, in in_top_k
    return tf.nn.in_top_k(predictions, targets, k)
  File "/home/s1885554/miniconda3/envs/mlp_tf/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py", line 2692, in in_top_k
    return gen_nn_ops.in_top_kv2(predictions, targets, k, name=name)
  File "/home/s1885554/miniconda3/envs/mlp_tf/lib/python3.5/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 4228, in in_top_kv2
    "InTopKV2", predictions=predictions, targets=targets, k=k, name=name)
  File "/home/s1885554/miniconda3/envs/mlp_tf/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/s1885554/miniconda3/envs/mlp_tf/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
    return func(*args, **kwargs)
  File "/home/s1885554/miniconda3/envs/mlp_tf/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
    op_def=op_def)
  File "/home/s1885554/miniconda3/envs/mlp_tf/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1731, in __init__
    control_input_ops)
  File "/home/s1885554/miniconda3/envs/mlp_tf/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1579, in _create_c_op
    raise ValueError(str(e))
ValueError: Shape must be rank 2 but is rank 4 for 'metrics/top_3_accuracy/in_top_k/InTopKV2' (op: 'InTopKV2') with input shapes: [?,11,11,4032], [?,?,?], [].

For reference: this is the code I was using for transfer learning:


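from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
from keras.utils import multi_gpu_model

base_model_transfer = NASNetLarge(input_shape=(image_shape[0],image_shape[1],3), include_top=False,
                                  weights='imagenet', input_tensor=None)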
filename = 'transfer_model'
current = base_model_transfer
out = current.output
out = GlobalAveragePooling2D()(out)
# add one fc
#out = Dense(35,activation='relu')(out)
preds = Dense(train_generator.num_classes,activation= 'softmax')(out)
transfer_model = Model(inputs=current.input,outputs=preds)
for layer in current.layers:
        layer.trainable = False

print("First training")
# train for a bit
parallel_model = multi_gpu_model(current, gpus=6)
parallel_model.compile(loss='categorical_crossentropy', 
                       optimizer='adam',metrics=['categorical_accuracy',top_3_accuracy]) # CRASHES HERE

parallel_model.fit_generator(
            train_generator,
            steps_per_epoch=50,
            epochs=2,
            callbacks = callbacks,
            validation_data=validation_generator,
            validation_steps=30)

print("Second training")
# freeze layers
freeze = 3 #num last layers to freeze
#freeze = freeze + 1 # for zero indexing
model_layer_length = len(transfer_model.layers)

for layer in transfer_model.layers[:model_layer_length - freeze]:
    layer.trainable = False
for layer in transfer_model.layers[freeze:]:
    layer.trainable = True

from keras.optimizers import SGD
parallel_model = multi_gpu_model(transfer_model, gpus=6)
parallel_model.compile(loss='categorical_crossentropy',
                           optimizer=SGD(lr=0.0001, momentum=0.9),metrics=['categorical_accuracy',top_3_accuracy])

parallel_model.fit_generator(
            train_generator,
            steps_per_epoch=100,
            epochs=50,
            callbacks = callbacks,
            validation_data=validation_generator,
            validation_steps=80)

parallel_model.save_weights('{}.h5'.format(filename))

taehoonlee commented 5 years ago

@Waffleboy, You should change parallel_model = multi_gpu_model(current, gpus=6) to parallel_model = multi_gpu_model(transfer_model, gpus=6).
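
The traceback above complains about rank 4 because the metric was attached to the headless base model, whose output is a feature map of shape (batch, 11, 11, 4032), rather than to the model that ends in the pooling and softmax layers. Below is a NumPy sketch of the shapes involved. It is not the Keras implementation: top_k_accuracy is a made-up helper, the dense layer is a random matrix, and 16 channels stand in for 4032.

```python
import numpy as np

def top_k_accuracy(y_true, y_pred, k=3):
    """Hypothetical NumPy stand-in for a top-k metric on (batch, classes) arrays."""
    if y_pred.ndim != 2:
        raise ValueError('predictions must be rank 2 but are rank %d' % y_pred.ndim)
    true_idx = y_true.argmax(axis=-1)
    top_k = np.argsort(y_pred, axis=-1)[:, -k:]      # indices of the k largest scores
    hits = [t in row for t, row in zip(true_idx, top_k)]
    return float(np.mean(hits))

rng = np.random.default_rng(0)
num_classes = 7
features = rng.random((4, 11, 11, 16))               # base model output: rank 4
pooled = features.mean(axis=(1, 2))                  # global average pooling: (4, 16)
logits = pooled @ rng.random((16, num_classes))      # dense layer without bias: (4, 7)
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)  # softmax
labels = np.eye(num_classes)[[0, 1, 2, 3]]           # one-hot targets

print(pooled.shape, probs.shape)
print(top_k_accuracy(labels, probs, k=3))            # rank-2 predictions are accepted
```

Passing features instead of probs to the helper raises the rank error, which mirrors what happens when the base model is compiled with the metric instead of transfer_model.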

Waffleboy commented 5 years ago

Hey, thanks for your help!

I've been trying various things over the past few days, and I wonder if there's an issue with the NASNet weights? It performs horribly (about the same as chance), both when trained from scratch and via transfer learning.

taehoonlee commented 5 years ago

@Waffleboy, The NASNet weights are fine.

juice500ml commented 5 years ago

The docs are somewhat misleading. They should say that the imagenet weights are not usable with different input shapes. https://keras.io/applications/#nasnet

sreenivasaupadhyaya commented 5 years ago

@Waffleboy

Hey, did you solve the ImageNet weight loading issue caused by the size mismatch (your original issue)? After reading some blogs and a few posts, I found out that not all the Keras models listed on the keras.applications webpage are officially present in the Keras implementation yet.

So maybe they will be released in coming versions.

Srini

PCdurham commented 5 years ago

Back to NASNet: we've been getting great results from both Large and Mobile using transfer learning, but with image sizes of only 50x50. This change in Keras causes issues for us. As noted above, when you try to download NASNet for the first time on a new install, you get the 331 tile size error because of the enforced 331 size. But on machines where the download already exists, everything works fine. We found that if you change nasnet.py, line 167 to

require_flatten=False

The code runs once again. Is this technically correct or is it problematic?

P

LOGHORIZION commented 5 years ago

Thanks a lot!

doraeric commented 4 years ago

@taehoonlee

I found that the Inception model passes include_top as require_flatten when calling _obtain_input_shape in this commit. Is it okay to apply the same change to NASNet? It raises an error when trying transfer learning because require_flatten is set to True now.
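
A simplified sketch of the check being discussed may help show why the message mentions include_top=True even when include_top=False was passed. The function check_input_shape below is a hypothetical stand-in written for this thread, not the library's _obtain_input_shape. It assumes channels_last data and models only the imagenet branch.

```python
def check_input_shape(input_shape, default_size, require_flatten, weights=None):
    """Simplified, hypothetical stand-in for the shape check discussed above."""
    default_shape = (default_size, default_size, 3)   # channels_last assumed
    if weights == 'imagenet' and require_flatten:
        if input_shape is not None and input_shape != default_shape:
            raise ValueError(
                'When setting `include_top=True` and loading `imagenet` '
                'weights, `input_shape` should be %s.' % (default_shape,))
        return default_shape
    return input_shape if input_shape is not None else default_shape

# NASNet-style call: require_flatten is hard-coded to True,
# so include_top is never consulted and the message is misleading.
try:
    check_input_shape((50, 50, 3), 331, require_flatten=True, weights='imagenet')
except ValueError as err:
    print(err)

# Inception-style call: require_flatten follows include_top.
include_top = False
print(check_input_shape((50, 50, 3), 331, require_flatten=include_top, weights='imagenet'))
```

This only shows why the check fires. It says nothing about whether the pretrained weights still match the architecture at other input sizes, which is the concern raised earlier in the thread.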

mazatov commented 4 years ago

I wonder the same thing, @doraeric. As I understand it, the difference arises when the input image size is even rather than odd, correct? So shouldn't we be able to load the weights as long as our input size is odd? I currently download the NASNet-large-no-top.h5 weights separately and do something like the code below. I am not sure if the weights are being loaded correctly, though. The same code doesn't raise an error even when the input shape is even 🤷‍♂

input_shape = (199,99,3)
base_model = NASNetLarge(weights = None, include_top=False, input_shape = input_shape)
base_model.load_weights('NASNet-large-no-top.h5')

Most of my images are really small, under 150x75 pixels. So it seems crazy to resize them all to (331,331), but so far the NASNetLarge model with (331,331) input size is the best performer on my dataset by far. I'd just rather make it work at a smaller resolution to save computation time.

rafalfirlejczyk commented 4 years ago

Thanks @mazatov. Loading the NASNetMobile model and then the weights from the previously downloaded 'NASNet-mobile-no-top.h5' solved my issue.

arturo-opsetmoen-amador commented 3 years ago

Hi @PCdurham,

I got the same error in TF 2.4.1. I changed line 177 of nasnet.py (the one that was line 167 for you) to require_flatten=False. Did you get any answer on the potential dangers of changing this?

kalelpark commented 2 years ago

I recommend that everyone refer to this link!

https://www.tensorflow.org/api_docs/python/tf/keras/applications/nasnet/NASNetLarge

keras-team / keras-applications

NASNet misleading ValueError when include_top=False and weights #78