keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

How to handle grayscale images when fine-tuning VGG16? #20031

Closed mearcla closed 1 week ago

mearcla commented 1 month ago

Hi keras team!

I am trying to use a pre-trained VGG16 with SRGAN on grayscale images. When I use VGG16 this way:

from keras.applications import VGG19
from keras.models import Model

def build_vgg():
    input_shape = (256, 256, 3)
    # Load a VGG19 model pre-trained on the ImageNet dataset
    vgg = VGG19(weights="imagenet", include_top=False, input_shape=input_shape)
    # Freeze every layer
    vgg.trainable = False
    return Model(inputs=vgg.inputs, outputs=vgg.layers[10].output)

my model works without any problem and produces grayscale images. But when I freeze only some layers of VGG16 and train the others, as in this code:

from keras.applications import VGG19
from keras.models import Model

def build_vgg():
    """
    Build the VGG network to extract image features
    """
    input_shape = (256, 256, 3)
    vgg = VGG19(weights='imagenet', include_top=False, input_shape=input_shape)

    # Set the layers to be frozen
    for layer in vgg.layers[:8]:
        layer.trainable = False

    # Set the layers to be trainable
    for layer in vgg.layers[8:]:
        layer.trainable = True

    output_layer = vgg.get_layer('block5_conv4').output
    input_layer = vgg.input
    vgg.summary()  # summary() prints the table itself and returns None
    model = Model(input_layer, output_layer)
    return model

The generated images look as if they have a blue filter, sometimes a brown one, or all of them come out blue. Why does this happen, and how can I fine-tune VGG16 with grayscale images?

emi-dm commented 1 month ago

How are you using the data? Are you repeating the same channel three times? What is your final task (classification, segmentation)? There are two ways:

  1. Pass the original single-channel data (without repeating the channel) and connect the Input layer to a Conv2D layer with 3 filters, so that its output has three channels.
  2. Repeat the single channel three times and pass the result to the model as input data.
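
The two options above could look roughly like the following. This is only a sketch under assumptions: the names `repeat_gray_channel`, `build_gray_vgg`, and `gray_to_rgb` are made up for illustration, the 1x1 kernel size is one possible choice, and the `weights` and `size` parameters are exposed only so the function can be tried without downloading the ImageNet weights.

```python
import numpy as np
from keras import Model, layers
from keras.applications import VGG19


def repeat_gray_channel(images):
    """Option 2: copy the gray channel three times, (N, H, W, 1) -> (N, H, W, 3)."""
    return np.repeat(images, 3, axis=-1)


def build_gray_vgg(weights="imagenet", size=256):
    """Option 1: map 1 channel to 3 with a Conv2D layer, then apply VGG19.

    Hypothetical helper, not part of Keras. Pass weights=None to skip the
    ImageNet download while experimenting.
    """
    gray_in = layers.Input(shape=(size, size, 1))
    # A 1x1 convolution with 3 filters; its weights are learned during training.
    rgb_like = layers.Conv2D(3, kernel_size=1, padding="same", name="gray_to_rgb")(gray_in)
    vgg = VGG19(weights=weights, include_top=False, input_shape=(size, size, 3))
    features = vgg(rgb_like)
    return Model(gray_in, features)
```

With option 2 the three channels are exact copies when they enter the network; with option 1 the mapping to three channels is learned, so the channels generally differ.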

Keep in mind that your model's output is now an intermediate convolutional representation (output_layer = vgg.get_layer('block5_conv4').output), not a final representation that you could use directly for a task.

I hope that my answer can help you :)

mearcla commented 1 month ago

@sachinprasadhs, thank you for your answer. I use VGG16 with SRGAN to enhance images, and I repeated the channel 3 times. But why does it work without problems and generate gray images when I freeze all layers of VGG, as in this code?

from keras.applications import VGG19
from keras.models import Model

def build_vgg():
    input_shape = (256, 256, 3)
    # Load a VGG19 model pre-trained on the ImageNet dataset
    vgg = VGG19(weights="imagenet", include_top=False, input_shape=input_shape)
    # Freeze every layer
    vgg.trainable = False
    output_layer = vgg.get_layer('block3_conv3').output
    vgg.summary()  # summary() prints the table itself and returns None
    input_layer = vgg.input
    model = Model(input_layer, output_layer)
    return model

I have another question, please: why do some code examples for VGG16 use one of the following?

output_layer = vgg.get_layer('block3_conv3').output
Or 
output_layer = vgg.get_layer('block4_conv2').output
Or 
output_layer = vgg.get_layer('block5_conv3').output

emi-dm commented 1 month ago

Hi, I'll answer your second question first. Sometimes you want to use only part of the network (typically, the network without the top layers is used as a backbone to extract features), and the layer name selects where that part ends. That is the main reason!
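
To make the layer choice concrete, here is a minimal sketch that cuts VGG19 at a named layer and reports the feature-map shape each choice gives. `build_feature_extractor` and `feature_shapes` are hypothetical helper names, and the `weights` and `size` parameters are added only so the shapes can be inspected without downloading the ImageNet weights.

```python
from keras import Model
from keras.applications import VGG19


def build_feature_extractor(layer_name, weights="imagenet", size=256):
    """Return a frozen VGG19 truncated at `layer_name` (hypothetical helper)."""
    vgg = VGG19(weights=weights, include_top=False, input_shape=(size, size, 3))
    vgg.trainable = False
    return Model(vgg.input, vgg.get_layer(layer_name).output)


def feature_shapes(layer_names, weights="imagenet", size=256):
    """Map each layer name to the shape of its output feature map."""
    vgg = VGG19(weights=weights, include_top=False, input_shape=(size, size, 3))
    return {name: tuple(vgg.get_layer(name).output.shape) for name in layer_names}


if __name__ == "__main__":
    names = ["block3_conv3", "block4_conv2", "block5_conv3"]
    # weights=None: only the shapes matter here, not the learned filters.
    for name, shape in feature_shapes(names, weights=None).items():
        print(name, shape)
```

Deeper layers give spatially smaller feature maps with more channels, so the layer name sets how abstract the extracted features are.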

Regarding the first question, I don't know why it is giving you a tensor with more than one channel. Please comment here with the output shape of the last layer of your VGG16 model! :)
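
One way to gather that information is sketched below, assuming a generated image is available as an (H, W, C) array. `describe_channels` is a hypothetical helper, not a Keras function: it reports the array shape and whether the channels are identical, which is what a truly gray image would show.

```python
import numpy as np


def describe_channels(image, tol=1e-6):
    """Summarize an (H, W, C) image: shape, per-channel means, and grayness.

    Hypothetical diagnostic helper. `is_gray` is True only when every channel
    matches the per-pixel channel average to within `tol`.
    """
    image = np.asarray(image, dtype=np.float64)
    if image.ndim != 3:
        raise ValueError("expected an array of shape (H, W, C)")
    channel_means = image.mean(axis=(0, 1))
    # Largest deviation of any channel from the per-pixel channel average.
    spread = np.abs(image - image.mean(axis=-1, keepdims=True)).max()
    return {
        "shape": image.shape,
        "channel_means": channel_means,
        "is_gray": bool(spread <= tol),
    }
```

If `is_gray` is False and one channel mean is clearly larger than the others, the tint is present in the generated array itself rather than being introduced when the image is displayed. The shape of the truncated VGG model can be read separately from `model.output_shape`.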

github-actions[bot] commented 3 weeks ago

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions[bot] commented 1 week ago

This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.
