usnistgov / image-classification-resnet50

Tensorflow 2.x Image Classification ResNet50 Model

Model Flops Discrepancy #3

Open Woodyet opened 1 year ago

Woodyet commented 1 year ago

Hi,

I have been running some tests, and your model reports roughly 4 GFLOPs.

The original paper and the Keras implementation both report 3.8 GFLOPs.

Any idea where the difference comes from?

Woodyet commented 1 year ago

Found the error: it's in the conv block definition. I changed it as shown here; it's a very easy fix.

import tensorflow as tf


def _conv_block(input, filters, stride, use_l2_regularizer):
    # Bottleneck block with a projection shortcut. ResNet50 is the enclosing
    # model class that provides _gen_l2_regularizer.
    filter1, filter2, filter3 = filters

    # 1x1 reduce conv: the block's stride is applied here.
    x = tf.keras.layers.Conv2D(
        filters=filter1,
        kernel_size=1,
        strides=stride,
        padding='same',
        activation=None,
        use_bias=False,
        kernel_initializer='he_normal',
        kernel_regularizer=ResNet50._gen_l2_regularizer(use_l2_regularizer),
        data_format='channels_last')(input)
    # With channels_last the channel axis is the last one, so normalize over -1.
    x = tf.keras.layers.BatchNormalization(axis=-1)(x)
    x = tf.keras.layers.Activation('relu')(x)

    # 3x3 conv: always stride 1.
    x = tf.keras.layers.Conv2D(
        filters=filter2,
        kernel_size=3,
        strides=1,
        padding='same',
        activation=None,
        use_bias=False,
        kernel_initializer='he_normal',
        kernel_regularizer=ResNet50._gen_l2_regularizer(use_l2_regularizer),
        data_format='channels_last')(x)
    x = tf.keras.layers.BatchNormalization(axis=-1)(x)
    x = tf.keras.layers.Activation('relu')(x)

    # 1x1 expand conv.
    x = tf.keras.layers.Conv2D(
        filters=filter3,
        kernel_size=1,
        strides=1,
        padding='same',
        activation=None,
        use_bias=False,
        kernel_initializer='he_normal',
        kernel_regularizer=ResNet50._gen_l2_regularizer(use_l2_regularizer),
        data_format='channels_last')(x)
    x = tf.keras.layers.BatchNormalization(axis=-1)(x)
    # NOTE: the reference bottleneck applies ReLU only after the add below.
    x = tf.keras.layers.Activation('relu')(x)

    # Projection shortcut: 1x1 conv with the same stride, so shapes match at the add.
    shortcut = tf.keras.layers.Conv2D(
        filters=filter3,
        kernel_size=1,
        strides=stride,
        padding='same',
        activation=None,
        use_bias=False,
        kernel_initializer='he_normal',
        kernel_regularizer=ResNet50._gen_l2_regularizer(use_l2_regularizer),
        data_format='channels_last')(input)
    shortcut = tf.keras.layers.BatchNormalization(axis=-1)(shortcut)

    x = tf.keras.layers.add([x, shortcut])
    x = tf.keras.layers.Activation('relu')(x)
    return x

The stride argument was being applied in the wrong Conv2D call :)
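
For anyone wondering why moving one argument changes the total, here is a rough back-of-the-envelope sketch in plain Python. It counts only convolution multiply-accumulates (profilers may count FLOPs differently), assumes a 224x224 input with the usual ResNet50 stage shapes, and assumes the stride previously sat on the 3x3 conv, which the thread does not actually state. The helper names conv_macs and bottleneck_macs are made up for this illustration and are not part of the repo.

```python
import math


def conv_macs(h, w, c_in, c_out, kernel, stride):
    """Multiply-accumulates and output size of a 'same'-padded Conv2D without bias."""
    out_h = math.ceil(h / stride)
    out_w = math.ceil(w / stride)
    return out_h * out_w * c_in * c_out * kernel * kernel, out_h, out_w


def bottleneck_macs(h, w, c_in, filters, stride, stride_in_first):
    """Conv MACs of one projection block (three main-path convs plus shortcut)."""
    f1, f2, f3 = filters
    # Put the stride either on the first 1x1 conv or on the 3x3 conv.
    s1, s2 = (stride, 1) if stride_in_first else (1, stride)
    m1, h1, w1 = conv_macs(h, w, c_in, f1, 1, s1)    # 1x1 reduce
    m2, h2, w2 = conv_macs(h1, w1, f1, f2, 3, s2)    # 3x3
    m3, _, _ = conv_macs(h2, w2, f2, f3, 1, 1)       # 1x1 expand
    ms, _, _ = conv_macs(h, w, c_in, f3, 1, stride)  # projection shortcut
    return m1 + m2 + m3 + ms


# (input height/width, input channels, filters) of the three stride-2 blocks
# in a standard ResNet50 at 224x224 input (assumed, not read from the repo).
STRIDED_BLOCKS = [
    (56, 256, (128, 128, 512)),
    (28, 512, (256, 256, 1024)),
    (14, 1024, (512, 512, 2048)),
]

extra = 0
for size, c_in, filters in STRIDED_BLOCKS:
    first = bottleneck_macs(size, size, c_in, filters, 2, stride_in_first=True)
    second = bottleneck_macs(size, size, c_in, filters, 2, stride_in_first=False)
    print(f"{size}x{size}x{c_in}: stride in first 1x1 = {first:,}, in 3x3 = {second:,}")
    extra += second - first

print(f"extra conv MACs with the stride on the 3x3 conv: {extra:,}")
```

Under those assumptions each of the three strided blocks costs 77,070,336 more multiply-accumulates when the first 1x1 conv runs at full resolution, about 0.23 G in total, which is the same order as the gap between roughly 4 G and 3.8 G reported above.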