
Weight shape inconsistency between TF-Hub and GitHub models #1132

Closed nss-ysasaki closed 2 years ago

nss-ysasaki commented 2 years ago

I have a model trained using a TF-Hub version of EffNetV2-M (https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet21k_ft1k_m/feature_vector/2), and I want to load it with this repository's effnetv2_model.EffNetV2Model. I am doing this because I need the outputs of intermediate layers (for model-explanation purposes), which the TF-Hub API does not expose.
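(What I'm ultimately after is something like the following; a minimal sketch, assuming `EffNetV2Model.call` accepts a `with_endpoints` flag and populates `model.endpoints` as in this repository's `effnetv2_model.py`, with `model` being the GitHub model built in the second snippet below:)

```python
# Sketch: access intermediate feature maps of the GitHub model.
# Assumption: EffNetV2Model.call accepts with_endpoints and fills
# model.endpoints, as in this repository's effnetv2_model.py.
images = tf.ones([1, IMAGE_SIZE, IMAGE_SIZE, 3], tf.float32)
_ = model(images, training=False, with_endpoints=True)
for name, feat in model.endpoints.items():
    print(name, feat.shape)  # stem / block / reduction feature maps
```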

Then I noticed that the shapes of the weights in the TF-Hub and GitHub models differ, so I cannot transfer the weights from the TF-Hub model to the GitHub one.

What is causing this shape mismatch? Were the models published on TF-Hub trained with hyperparameters different from those this repository uses?


The code to extract shapes is (roughly) as follows:

* TF-Hub
```python
import tensorflow as tf
import tensorflow_hub as hub

# CORE_LAYER_PATH, IMAGE_SIZE, CLASSES, and load_locally are defined
# elsewhere in my setup.
pretrained_model = hub.KerasLayer(
    CORE_LAYER_PATH,
    trainable=True,
    input_shape=[*IMAGE_SIZE, 3],
    load_options=load_locally)

model = tf.keras.Sequential([
    tf.keras.layers.Lambda(
        lambda data: tf.image.convert_image_dtype(data, tf.float32),
        input_shape=[*IMAGE_SIZE, 3]),
    pretrained_model,
    tf.keras.layers.Dense(len(CLASSES), activation='softmax')
])

# Print weight names and shapes
names = [weight.name for layer in model.layers for weight in layer.weights]
weights = {name: weight for name, weight in zip(names, model.get_weights())}

for name in sorted(weights.keys()):
    print(f"{name} ({weights[name].shape})")
```


* GitHub
```python
import copy

import tensorflow as tf

import preprocessing
import datasets
import effnetv2_model
import effnetv2_configs
import hparams

model_name = 'efficientnetv2-m'
num_classes = 7
dataset_cfg = 'customdatasetclass7'
IMAGE_SIZE = 1024
hparam_str = "model.num_classes={},data.num_classes={},eval.isize={}".format(
    num_classes, num_classes, IMAGE_SIZE)

def get_config(model_name, dataset_cfg, hparam_str=''):
    """Create a keras model for EffNetV2."""
    config = copy.deepcopy(hparams.base_config)
    config.override(effnetv2_configs.get_model_config(model_name))
    config.override(datasets.get_dataset_config(dataset_cfg))
    config.override(hparam_str)
    config.model.num_classes = config.data.num_classes

    return config

tf.keras.backend.clear_session()
config = get_config(model_name, dataset_cfg, hparam_str=hparam_str)
model = effnetv2_model.EffNetV2Model(
    model_name=model_name, model_config=config.model)
cfg = model.cfg
model(tf.ones([1, IMAGE_SIZE, IMAGE_SIZE, 3], tf.float32), training=False)

# Print weight names and shapes
names = [weight.name for layer in model.layers for weight in layer.weights]
weights = {name: weight for name, weight in zip(names, model.get_weights())}

for name in sorted(weights.keys()):
    print(f"{name} ({weights[name].shape})")
```
nss-ysasaki commented 2 years ago

False alarm. It was a bug in the weight conversion process; the weight shapes of the GitHub model I showed above were wrong. They are consistent with the TF-Hub model, like so:

```
blocks_1
blocks_1/efficientnetv2-m
blocks_1/efficientnetv2-m/blocks_1
blocks_1/efficientnetv2-m/blocks_1/conv2d
blocks_1/efficientnetv2-m/blocks_1/conv2d/kernel:0, (3, 3, 24, 24)
blocks_1/efficientnetv2-m/blocks_1/tpu_batch_normalization
blocks_1/efficientnetv2-m/blocks_1/tpu_batch_normalization/beta:0, (24,)
blocks_1/efficientnetv2-m/blocks_1/tpu_batch_normalization/gamma:0, (24,)
blocks_1/efficientnetv2-m/blocks_1/tpu_batch_normalization/moving_mean:0, (24,)
blocks_1/efficientnetv2-m/blocks_1/tpu_batch_normalization/moving_variance:0, (24,)
```
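(For anyone hitting the same problem: once a name mapping between the two checkpoints is in place, the transfer itself is just shape-checked assignment. A minimal sketch; `github_model`, `hub_weights`, and `translate_name` are hypothetical placeholders for the conversion-specific pieces:)

```python
# Hypothetical helpers: hub_weights maps TF-Hub variable names to
# arrays; translate_name maps a GitHub variable name to its TF-Hub
# counterpart; github_model is the EffNetV2Model built from this repo.
for var in github_model.weights:
    src = hub_weights[translate_name(var.name)]
    assert var.shape == src.shape, (var.name, var.shape, src.shape)
    var.assign(src)  # copy the TF-Hub weight into the GitHub model
```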