keras-team / keras

Deep Learning for humans
Apache License 2.0

Deserializing Error when loading models from '.keras' files in Keras 3, issue with dense layers #20128

Open ErrolDaRocha opened 1 month ago

ErrolDaRocha commented 1 month ago

I am using Google Colab with the Tensorflow v2.17 and Keras v 3.4.1 libraries.

I need to save and load my models, but I haven't been able to make the '.keras' file format load correctly.

Here is the line for saving the model:

, 'model_' + model_name + '.keras'))

Here is the line for loading the model:

model = keras.models.load_model(os.path.join(model_path, 'model_' + model_name + '.keras'), custom_objects=custom_objects)

This is my error:

ValueError                                Traceback (most recent call last)
<ipython-input-9-882590e77519> in <cell line: 10>()
      9 # Load the model
---> 10 model = keras.models.load_model(os.path.join(model_path, 'model_' + model_name + '.keras'), custom_objects=custom_objects)

3 frames
/usr/local/lib/python3.10/dist-packages/keras/src/saving/ in _raise_loading_failure(error_msgs, warn_only)
    454         warnings.warn(msg)
    455     else:
--> 456         raise ValueError(msg)

ValueError: A total of 2 objects could not be loaded. Example error message for object <Dense name=z_mean, built=True>:

Layer 'z_mean' expected 2 variables, but received 0 variables during loading. Expected: ['kernel', 'bias']

List of objects that could not be loaded:
[<Dense name=z_mean, built=True>, <Dense name=z_log_var, built=True>]
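
One way to check what was actually written to disk is to open the saved file directly. The sketch below assumes, as the Keras 3 saving documentation describes, that a `.keras` file is a zip archive holding `config.json`, `metadata.json`, and `model.weights.h5`. The helper name `summarize_keras_archive` is hypothetical and is not part of Keras; it uses only the Python standard library.

```python
import json
import zipfile


def summarize_keras_archive(path):
    """Return the sorted member names of a .keras archive and the
    top-level class name recorded in its config.json (None if absent)."""
    with zipfile.ZipFile(path) as archive:
        names = sorted(archive.namelist())
        if "config.json" in names:
            # config.json stores the serialized model architecture.
            config = json.loads(archive.read("config.json"))
        else:
            config = {}
    return names, config.get("class_name")
```

Calling it on the saved `.keras` file path returns the archive members and the model class. If `model.weights.h5` is missing from the list, the problem is on the saving side and not in `load_model`.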

This is the model that I trained:

latent_dim = 32

# Encoder
encoder_input = Input(shape=(height, width, channels), name='encoder_input')
x = Conv2D(64, (3, 3), activation='relu', padding='same')(encoder_input)

# Flatten layer
shape_before_flattening = K.int_shape(x)[1:]
x = Flatten()(x)

z_mean = Dense(latent_dim, name='z_mean')(x)
z_log_var = Dense(latent_dim, name='z_log_var')(x)

# Reparameterization trick
def sampling(args):
    z_mean, z_log_var = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim), mean=0., stddev=1.0)
    return z_mean + K.exp(z_log_var / 2) * epsilon

z = Lambda(sampling, output_shape=(latent_dim,), name='z')([z_mean, z_log_var])

# Decoder
decoder_input = Input(K.int_shape(z)[1:])
x = Dense(
x = Reshape(shape_before_flattening)(x)
decoder_output = Conv2D(channels, (3, 3), activation='sigmoid', padding='same')(x)

class CustomLayer(keras.layers.Layer):
    def __init__(self, beta=1.0,  **kwargs):
        self.is_placeholder = True
        super(CustomLayer, self).__init__(**kwargs)
        self.beta = beta
        self.recon_loss_metric = tf.keras.metrics.Mean(name='recon_loss')
        self.kl_loss_metric = tf.keras.metrics.Mean(name='kl_loss')

    def vae_loss(self, x, z_decoded, z_mean, z_log_var):
        recon_loss = keras.losses.binary_crossentropy(K.flatten(x), K.flatten(z_decoded))
        kl_loss = -0.5 * K.mean(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
        return recon_loss, self.beta * kl_loss

    def call(self, inputs):
        x = inputs[0]
        z_decoded = inputs[1]
        z_mean = inputs[2]
        z_log_var = inputs[3]
        recon_loss, kl_loss = self.vae_loss(x, z_decoded, z_mean, z_log_var)
        self.add_loss(K.mean(recon_loss + kl_loss))
        return x

    def compute_output_shape(self, input_shape):
        return input_shape[0]

    def get_metrics(self):
        return {'recon_loss': self.recon_loss_metric.result().numpy(),
                'kl_loss': self.kl_loss_metric.result().numpy()}

# Models
encoder = Model(encoder_input, [z_mean, z_log_var, z], name='encoder')
decoder = Model(decoder_input, decoder_output, name='decoder')
vae_output = decoder(encoder(encoder_input)[2])
y = CustomLayer()([encoder_input, vae_output, z_mean, z_log_var])
model = Model(encoder_input, y, name='vae')
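
For reference, the `sampling` function and the `vae_loss` method above appear to compute the quantities below. This is a reading of the posted code and not an official formulation. The symbols are introduced here for illustration only: $\mu$ stands for `z_mean`, $s$ for `z_log_var`, $d$ for `latent_dim`, $\hat{x}$ for `z_decoded`, and $\mathrm{BCE}$ for the value returned by `keras.losses.binary_crossentropy` on the flattened tensors.

```latex
\begin{align}
z &= \mu + \exp\!\left(\tfrac{1}{2} s\right) \odot \epsilon,
  \qquad \epsilon \sim \mathcal{N}(0, I) \\
L_{\mathrm{KL}} &= -\frac{1}{2} \cdot \frac{1}{d} \sum_{j=1}^{d}
  \left(1 + s_j - \mu_j^{2} - e^{s_j}\right) \\
L &= \operatorname{mean}\!\left(\mathrm{BCE}(x, \hat{x}) + \beta \, L_{\mathrm{KL}}\right)
\end{align}
```

Note that the code averages over the latent dimension (`K.mean(..., axis=-1)`), whereas the closed-form KL divergence against a standard normal sums over it, so the KL term here is scaled by $1/d$.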

This model was just used for testing the bug. I have used tf.keras as an alternative for loading the model, but I received the same error. Interestingly, when I run the code for the first time, the following warning is included in the error output. When the same code is run again, the warning is no longer included:

/usr/local/lib/python3.10/dist-packages/keras/src/saving/ UserWarning: Skipping variable loading for optimizer 'adam', because it has 30 variables whereas the saved optimizer has 22 variables. 

I have tested the code on the latest Keras v3.5 and got similar results:

/usr/local/lib/python3.10/dist-packages/keras/src/saving/ UserWarning: Skipping variable loading for optimizer 'adam', because it has 30 variables whereas the saved optimizer has 22 variables. 
ValueError                                Traceback (most recent call last)
<ipython-input-9-00610835a4a5> in <cell line: 10>()
      9 # Load the model
---> 10 model = keras.models.load_model(os.path.join(model_path, 'model_' + model_name + '.keras'), custom_objects=custom_objects)

3 frames
/usr/local/lib/python3.10/dist-packages/keras/src/saving/ in _raise_loading_failure(error_msgs, warn_only)
    591         warnings.warn(msg)
    592     else:
--> 593         raise ValueError(msg)

ValueError: A total of 2 objects could not be loaded. Example error message for object <Dense name=z_mean, built=True>:

Layer 'z_mean' expected 2 variables, but received 0 variables during loading. Expected: ['kernel', 'bias']

List of objects that could not be loaded:
[<Dense name=z_mean, built=True>, <Dense name=z_log_var, built=True>]

I have tested the bug again by saving and loading the model into separate weights and json files:

# saving
with open(os.path.join(model_path, 'model_' + model_name + '.json'), 'w') as json_file:
model.save_weights(os.path.join(model_path, 'model_' + model_name + '.weights.h5'))

# loading
with open(os.path.join(model_path, 'model_' + model_name + '.json'), 'r') as json_file:
    model_json =
model = model_from_json(model_json, custom_objects=custom_objects)
model.load_weights(os.path.join(model_path, 'model_' + model_name + '.weights.h5'))

The error is at least slightly different:

/usr/local/lib/python3.10/dist-packages/keras/src/saving/ UserWarning: Skipping variable loading for optimizer 'adam', because it has 34 variables whereas the saved optimizer has 22 variables. 
ValueError                                Traceback (most recent call last)
<ipython-input-14-52bd158e3e0f> in <cell line: 11>()
      9     model_json =
     10 model = model_from_json(model_json, custom_objects=custom_objects)
---> 11 model.load_weights(os.path.join(model_path, 'model_' + model_name + '.weights.h5'))
     13 # Load the encoder architecture and weights

1 frames
/usr/local/lib/python3.10/dist-packages/keras/src/saving/ in _raise_loading_failure(error_msgs, warn_only)
    591         warnings.warn(msg)
    592     else:
--> 593         raise ValueError(msg)

ValueError: A total of 3 objects could not be loaded. Example error message for object <Conv2D name=conv2d, built=True>:

Layer 'conv2d' expected 2 variables, but received 0 variables during loading. Expected: ['kernel', 'bias']

List of objects that could not be loaded:
[<Conv2D name=conv2d, built=True>, <Dense name=z_mean, built=True>, <Dense name=z_log_var, built=True>]
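
To compare which layers fail under each saving approach, the list at the end of the error message can be extracted programmatically. The sketch below is a minimal standard-library helper. The name `failed_layers` is hypothetical, and the regular expression assumes the layer repr format shown in the messages in this thread (`<Class name=..., built=...>`), which may differ in other Keras versions.

```python
import re


def failed_layers(message):
    """Extract (layer class, layer name) pairs from the
    'List of objects that could not be loaded' part of the message."""
    marker = "List of objects that could not be loaded:"
    # Only look at the text after the marker; empty if the marker is absent.
    _, _, tail = message.partition(marker)
    return re.findall(r"<(\w+) name=([^,>]+), built=(?:True|False)>", tail)
```

Passing `str(error)` from a caught `ValueError` gives a list that can be compared across runs. In this thread, the `.keras` run lists only `z_mean` and `z_log_var`, while the JSON plus weights run also lists `conv2d`.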

Ultimately it would be a lot better to find out that I've been doing something wrong and I can fix this problem myself. I've been hung up on this for a while, and I have a thesis to write.

mehtamansi29 commented 4 weeks ago

Hi @ErrolDaRocha -

Thanks for reporting the issue. You can save and load the model in the .h5 format rather than .keras. It will load the custom layers and Dense layers properly without error.

# saving
import os
model_path = '/content/drive/MyDrive/issue'
model_name = 'custom_layer_model'
, 'model_' + model_name + '.h5'))

# loading
new_model_path= '/content/drive/MyDrive/issue/model_custom_layer_model.h5'
loaded_model = keras.models.load_model(new_model_path)

Attached the gist for your reference.

mehtamansi29 commented 3 weeks ago

Hi @ErrolDaRocha -

Are you still able to reproduce this issue?

ghsanti commented 1 week ago


Demo here works