faustomorales / vit-keras

Keras implementation of ViT (Vision Transformer)
Apache License 2.0

RuntimeError: Unable to create link (name already exists) #13

Closed: awsaf49 closed this issue 3 years ago

awsaf49 commented 3 years ago

The model (or its weights) can't be saved; I'm getting RuntimeError: Unable to create link (name already exists). I'm guessing two weights got the same name, as described in this Link.

import tensorflow as tf
import vit_keras.vit as vit

inp = tf.keras.layers.Input(shape=(256, 256, 3))
base = vit.vit_b16(
    image_size=256,
    pretrained=True,
    include_top=False,
    pretrained_top=False,
)
x = base(inp)
x = tf.keras.layers.Dense(64, activation='relu')(x)
x = tf.keras.layers.Dense(5, activation='softmax')(x)
model = tf.keras.Model(inputs=inp, outputs=x)
opt = tf.keras.optimizers.Adam()
loss = tf.keras.losses.CategoricalCrossentropy()
model.compile(optimizer=opt, loss=loss, metrics=['categorical_accuracy'])
model.save('ViT.h5')  # raises RuntimeError: Unable to create link (name already exists)
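
For reference, the duplicated names can be confirmed directly (a minimal diagnostic sketch, assuming the base model defined above):

from collections import Counter

# Count each variable name; HDF5 saving fails when any name repeats.
counts = Counter(w.name for w in base.weights)
duplicates = {name: n for name, n in counts.items() if n > 1}
print(duplicates)  # non-empty if the model has clashing weight names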
faustomorales commented 3 years ago

I'm not sure about the reason for this error message. I don't think we're re-using any layer names. But it appears to work if you use the input layer created as part of the ViT model (see below).

import tensorflow as tf
import vit_keras.vit as vit

base = vit.vit_b16(
    image_size=256,
    pretrained=True,
    include_top=False,
    pretrained_top=False,
)
x = tf.keras.layers.Dense(64, activation='relu')(base.output)
x = tf.keras.layers.Dense(5, activation='softmax')(x)
model = tf.keras.Model(inputs=base.inputs, outputs=x)
opt = tf.keras.optimizers.Adam()
loss = tf.keras.losses.CategoricalCrossentropy()
model.compile(optimizer=opt, loss=loss, metrics=['categorical_accuracy'])
model.save('ViT.h5')  # saves without error

Alternatively, we could (and perhaps should) provide the user an option to supply their own input layer. Feedback welcome!
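
For what it's worth, here is a hypothetical sketch of what that option could look like (the input_tensor parameter and the build_vit_backbone helper are illustrative, not part of the current API):

import tensorflow as tf

def vit_b16(image_size=256, input_tensor=None, **kwargs):
    # Hypothetical signature: accept a user-supplied input tensor and fall
    # back to creating a fresh Input layer when none is given, similar to
    # the keras.applications convention.
    if input_tensor is None:
        input_tensor = tf.keras.layers.Input(shape=(image_size, image_size, 3))
    outputs = build_vit_backbone(input_tensor, **kwargs)  # illustrative helper
    return tf.keras.Model(inputs=input_tensor, outputs=outputs)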

awsaf49 commented 3 years ago

@faustomorales I think the duplicated names Dense_0 and Dense_1 are the cause of this error. If you check the variable names, you'll see that each Dense name has 12 duplicates, one per transformer block. My current workaround is to give each Dense layer a unique name; I also had to make sure the renamed variables still match the pretrained weight keys. It's not the cleanest fix, but it resolves the issue.

class TransformerBlock(tf.keras.layers.Layer):
    """Implements a Transformer block."""

    def __init__(self, *args, num_heads, mlp_dim, dropout, n, **kwargs):
        super().__init__(*args, **kwargs)
        self.num_heads = num_heads
        self.mlp_dim = mlp_dim
        self.dropout = dropout
        self.n = n  # block index, used to make layer names unique

    def build(self, input_shape):
        self.att = MultiHeadSelfAttention(
            num_heads=self.num_heads,
            name="MultiHeadDotProductAttention_1",
        )
        self.mlpblock = tf.keras.Sequential(
            [
                tf.keras.layers.Dense(
                    self.mlp_dim,
                    activation=tfa.activations.gelu,
                    name=f"TB{self.n}_Dense_0",  # unique per block
                ),
                tf.keras.layers.Dropout(self.dropout),
                tf.keras.layers.Dense(input_shape[-1], name=f"TB{self.n}_Dense_1"),
                tf.keras.layers.Dropout(self.dropout),
            ],
            name="MlpBlock_3",
        )
        # ... rest of build() unchanged
And to match the renamed variables back to the pretrained weight keys:

for match in matches:
    source_keys_used.extend(match["keys"])
    # Strip the "TB{n}_" prefix (4 characters, so this assumes a
    # single-digit block index) to recover the original key in params_dict.
    source_weights = [
        params_dict[k if not k.startswith('TB') else k[4:]]
        for k in match["keys"]
    ]
    if match.get("reshape", False):
        source_weights = [
            source.reshape(expected.shape)
            for source, expected in zip(
                source_weights, match["layer"].get_weights()
            )
        ]
faustomorales commented 3 years ago

Thanks for sharing that fix -- I made a similar one in https://github.com/faustomorales/vit-keras/commit/e9a62ba7e89479ac025084f5b69e5a16b689a494, which is now published in v0.0.12. Please comment back if this doesn't resolve the issue. Thanks again!
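
(For reference, the published fix can be picked up with pip install --upgrade vit-keras, assuming the package was installed from PyPI.)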