What is the decoder architecture of the qubvel segmentation_models?

qubvel / segmentation_models

Segmentation models with pretrained backbones. Keras and TensorFlow Keras.

MIT License

What is the decoder architecture of the qubvel segmentation_models? #500

Open ananya0409 opened 2 years ago

ananya0409 commented 2 years ago

I have a clear idea of the encoder side of segmentation_models. I am using ResNet-152 as the UNet backbone, but I am unclear about the decoder architecture. What kernel size, number of filters, stride, and upsampling layers are used on the decoder side of the UNet model? Can anyone please help me?

[Attached images: decoder, decoder1]

da2r-20 commented 2 years ago

The kernels are set to 3x3, as they are in the original paper. Take a look at Conv3x3BnReLU; it is used at various levels of sm.Unet.

In terms of decoder parameters, here is the sm.Unet constructor signature:

def Unet(
        backbone_name='vgg16',
        input_shape=(None, None, 3),
        classes=1,
        activation='sigmoid',
        weights=None,
        encoder_weights='imagenet',
        encoder_freeze=False,
        encoder_features='default',

        decoder_block_type='upsampling',
        decoder_filters=(256, 128, 64, 32, 16),
        decoder_use_batchnorm=True,

        **kwargs
):

You can control the number of decoder filters (decoder_filters) and the decoder upsampling type (decoder_block_type).
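To make the effect of decoder_filters concrete, here is a minimal pure-Python sketch that traces the decoder stage by stage. It assumes the encoder downsamples the input by a factor of 32 (common for ResNet-style backbones, but not stated in this thread) and that each decoder stage doubles the resolution before applying its two 3x3 convolutions. trace_decoder and encoder_stride are hypothetical names, not part of segmentation_models.

```python
def trace_decoder(input_size, decoder_filters=(256, 128, 64, 32, 16),
                  encoder_stride=32):
    """Return (stage, spatial_size, filters) for each decoder stage.

    Assumptions (not taken from the library source): the encoder output is
    input_size / encoder_stride, and every stage upsamples by a factor of 2.
    """
    if input_size % encoder_stride != 0:
        raise ValueError("input_size must be divisible by encoder_stride")

    size = input_size // encoder_stride  # spatial size at the bottleneck
    stages = []
    for stage, filters in enumerate(decoder_filters):
        size *= 2  # UpSampling2D(size=2) doubles height and width
        # Two 3x3 convolutions with stride 1 leave the spatial size unchanged
        stages.append((stage, size, filters))
    return stages


if __name__ == "__main__":
    for stage, size, filters in trace_decoder(256):
        print(f"decoder_stage{stage}: {size}x{size}, {filters} filters")
```

Under these assumptions, five decoder stages bring a 256x256 input back to its original resolution, with the filter count shrinking from 256 to 16.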

What parameters do you call sm.Unet() with? Can you share your model instantiation code?

Hope I understood your question...

ananya0409 commented 2 years ago

Yes, sir. You understood my query. Thank you so much for your response; it cleared up my doubt. However, what stride is used in each Conv2D on the decoder side? Is it set to stride=1? Here is my model instantiation code:

model1 = Unet(backbone_name='resnet152', encoder_weights='imagenet', encoder_freeze=True)
model1.compile('Adam', total_loss, metrics=metrics)
model1.summary()

da2r-20 commented 2 years ago

So you are using the upsampling decoder block by default. Ultimately, it is all Keras layers: Conv3x3BnReLU initializes a Conv2D with the default stride of (1, 1). This is your decoder block definition:

def DecoderUpsamplingX2Block(filters, stage, use_batchnorm=False):
    up_name = 'decoder_stage{}_upsampling'.format(stage)
    conv1_name = 'decoder_stage{}a'.format(stage)
    conv2_name = 'decoder_stage{}b'.format(stage)
    concat_name = 'decoder_stage{}_concat'.format(stage)

    concat_axis = 3 if backend.image_data_format() == 'channels_last' else 1

    def wrapper(input_tensor, skip=None):
        x = layers.UpSampling2D(size=2, name=up_name)(input_tensor)

        if skip is not None:
            x = layers.Concatenate(axis=concat_axis, name=concat_name)([x, skip])

        x = Conv3x3BnReLU(filters, use_batchnorm, name=conv1_name)(x)
        x = Conv3x3BnReLU(filters, use_batchnorm, name=conv2_name)(x)

        return x

    return wrapper
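As a rough illustration of what this block does to tensor shapes, the sketch below traces (height, width, channels) through one upsampling decoder block. It assumes the convolutions use padding='same', which is not shown in this thread; decoder_block_shape and conv2d_same_output_size are hypothetical helpers, and the channel counts in the usage note are illustrative.

```python
import math


def conv2d_same_output_size(size, stride=1):
    """Output size of a Conv2D with padding='same': ceil(size / stride)."""
    return math.ceil(size / stride)


def decoder_block_shape(input_shape, filters, skip_channels=None):
    """Trace (height, width, channels) through one upsampling decoder block."""
    height, width, channels = input_shape

    # UpSampling2D(size=2): doubles height and width, channels unchanged
    height, width = height * 2, width * 2

    # Concatenate along the channel axis when a skip tensor is given
    if skip_channels is not None:
        channels += skip_channels

    # Two 3x3 convolutions, stride (1, 1), padding assumed to be 'same'
    for _ in range(2):
        height = conv2d_same_output_size(height, stride=1)
        width = conv2d_same_output_size(width, stride=1)
        channels = filters

    return height, width, channels
```

For example, decoder_block_shape((8, 8, 2048), 256, skip_channels=1024) gives (16, 16, 256): only the upsampling layer changes the spatial size, while the stride-1 convolutions change only the channel count.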

ananya0409 commented 2 years ago

Yes, I got it. Thank you. I have one last question.

segmentation_models.Unet(backbone_name='vgg16', input_shape=(None, None, 3), classes=1, activation='sigmoid', weights=None, encoder_weights='imagenet', encoder_freeze=False, encoder_features='default', decoder_block_type='upsampling', decoder_filters=(256, 128, 64, 32, 16), decoder_use_batchnorm=True, **kwargs)

encoder_features – a list of layer numbers or names starting from top of the model. Each of these layers will be concatenated with corresponding decoder block. If default is used layer names are taken from DEFAULT_SKIP_CONNECTIONS.

Can we set encoder_features to something other than 'default', so that the skip connections are configured differently from the default?
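The documentation quoted above says that encoder_features accepts either 'default' or a list of layer numbers or names. A minimal sketch of that selection logic might look like the following. resolve_skip_layers and EXAMPLE_DEFAULT_SKIP_CONNECTIONS are hypothetical stand-ins rather than the library's actual code, and the table entry uses standard Keras VGG16 layer names purely for illustration.

```python
# Illustrative table only; the real DEFAULT_SKIP_CONNECTIONS lives in the
# library and its contents may differ between versions.
EXAMPLE_DEFAULT_SKIP_CONNECTIONS = {
    "vgg16": ("block5_conv3", "block4_conv3", "block3_conv3",
              "block2_conv2", "block1_conv2"),
}


def resolve_skip_layers(backbone_name, encoder_features="default"):
    """Hypothetical helper: choose the layers used as skip connections."""
    if encoder_features == "default":
        # 'default' means: look the layer names up by backbone
        return EXAMPLE_DEFAULT_SKIP_CONNECTIONS[backbone_name]
    # Otherwise the caller supplies layer names or indices explicitly,
    # listed starting from the top of the model as the docs describe.
    return tuple(encoder_features)
```

Under these assumptions, passing an explicit list such as encoder_features=['block4_conv3', 'block3_conv3'] would replace the default skip connections. Whether a given choice of layers produces tensors whose shapes match the decoder stages depends on the backbone and should be checked with model.summary().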