keras-team / keras-core

A multi-backend implementation of the Keras API, with support for TensorFlow, JAX, and PyTorch.
Apache License 2.0

Saving broken between tf.keras and Keras Core #855

Closed tirthasheshpatel closed 12 months ago

tirthasheshpatel commented 12 months ago

Saving doesn't work between tf.keras and keras_core. More specifically, weights saved in keras_core don't load in tf.keras and vice versa.

Here's an MRE:

Saving Script:

import keras_core as keras

class MyModel(keras.models.Model):
    def __init__(self, **kwargs):
        x = keras.Input((None, None, 3))
        x_out = keras.layers.Conv2D(1, 1)(x)
        super().__init__(inputs=x, outputs=x_out, **kwargs)

model = MyModel()
model.summary()
model.save_weights("model.weights.h5")

Model summary before saving:

Model: "my_model"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Layer (type)                       ┃ Output Shape                  ┃     Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ input_layer (InputLayer)           │ (None, None, None, 3)         │           0 │
├────────────────────────────────────┼───────────────────────────────┼─────────────┤
│ conv2d (Conv2D)                    │ (None, None, None, 1)         │           4 │
└────────────────────────────────────┴───────────────────────────────┴─────────────┘
 Total params: 4 (16.00 B)
 Trainable params: 4 (16.00 B)
 Non-trainable params: 0 (0.00 B)

Loading script:

import tensorflow as tf
from tensorflow import keras

tf.debugging.disable_traceback_filtering()

class MyModel(keras.models.Model):
    def __init__(self, **kwargs):
        x = keras.Input((None, None, 3))
        x_out = keras.layers.Conv2D(1, 1)(x)
        super().__init__(inputs=x, outputs=x_out, **kwargs)

model = MyModel()
model.summary()
model.load_weights("model.weights.h5")

Model summary before loading:

Model: "my_model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, None, None, 3)]   0         

 conv2d (Conv2D)             (None, None, None, 1)     4         

=================================================================
Total params: 4 (16.00 Byte)
Trainable params: 4 (16.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________

Error:

Traceback (most recent call last):
  File "/home/tirthasheshpatel/oss/keras-cv/mre.py", line 15, in <module>
    model.load_weights("model.weights.h5")
  File "/home/tirthasheshpatel/oss/virtualenvs/keras-dev/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 61, in error_handler
    return fn(*args, **kwargs)
  File "/home/tirthasheshpatel/oss/virtualenvs/keras-dev/lib/python3.10/site-packages/keras/src/engine/training.py", line 3132, in load_weights
    return saving_api.load_weights(
  File "/home/tirthasheshpatel/oss/virtualenvs/keras-dev/lib/python3.10/site-packages/keras/src/saving/saving_api.py", line 267, in load_weights
    saving_lib.load_weights_only(
  File "/home/tirthasheshpatel/oss/virtualenvs/keras-dev/lib/python3.10/site-packages/keras/src/saving/saving_lib.py", line 323, in load_weights_only
    _load_state(
  File "/home/tirthasheshpatel/oss/virtualenvs/keras-dev/lib/python3.10/site-packages/keras/src/saving/saving_lib.py", line 456, in _load_state
    _load_container_state(
  File "/home/tirthasheshpatel/oss/virtualenvs/keras-dev/lib/python3.10/site-packages/keras/src/saving/saving_lib.py", line 513, in _load_container_state
    _load_state(
  File "/home/tirthasheshpatel/oss/virtualenvs/keras-dev/lib/python3.10/site-packages/keras/src/saving/saving_lib.py", line 425, in _load_state
    trackable.load_own_variables(weights_store.get(inner_path))
  File "/home/tirthasheshpatel/oss/virtualenvs/keras-dev/lib/python3.10/site-packages/keras/src/engine/base_layer.py", line 3539, in load_own_variables
    raise ValueError(
ValueError: Layer 'conv2d' expected 2 variables, but received 0 variables during loading. Expected: ['conv2d/kernel:0', 'conv2d/bias:0']

Versions:

Keras Core version: 0.1.5
TensorFlow Keras version: 2.13.1

fchollet commented 12 months ago

Can you print model.summary() on each side (before saving/loading)?

tirthasheshpatel commented 12 months ago

Here's the output of model summary.

Model summary before saving:

Model: "my_model"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Layer (type)                       ┃ Output Shape                  ┃     Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ input_layer (InputLayer)           │ (None, None, None, 3)         │           0 │
├────────────────────────────────────┼───────────────────────────────┼─────────────┤
│ conv2d (Conv2D)                    │ (None, None, None, 1)         │           4 │
└────────────────────────────────────┴───────────────────────────────┴─────────────┘
 Total params: 4 (16.00 B)
 Trainable params: 4 (16.00 B)
 Non-trainable params: 0 (0.00 B)

Model summary before loading:

Model: "my_model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, None, None, 3)]   0         

 conv2d (Conv2D)             (None, None, None, 1)     4         

=================================================================
Total params: 4 (16.00 Byte)
Trainable params: 4 (16.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________

fchollet commented 12 months ago

This is probably caused by a naming discrepancy. We should fix it. @nkovela1 do you have cycles to take a look?

fchollet commented 12 months ago

Also -- as a workaround, you can do weights = model.get_weights(); new_model.set_weights(weights). This will work.
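In the two-script setup from the MRE, the saving and loading models live in different processes, so the weight list has to pass through a file. Below is a minimal sketch of that, assuming both models create their variables in the same order (kernel, then bias, for the single Conv2D here). The helper names `dump_weight_list` and `read_weight_list` and the file names are hypothetical, and only NumPy is used for the file format.

```python
import os
import tempfile

import numpy as np


def dump_weight_list(weights, path):
    """Write a list of arrays (e.g. from model.get_weights()) to an .npz file."""
    # Zero-padded keys keep the original order recoverable after loading.
    arrays = {f"w{i:05d}": np.asarray(w) for i, w in enumerate(weights)}
    np.savez(path, **arrays)


def read_weight_list(path):
    """Read the arrays back in the order they were written."""
    with np.load(path) as data:
        return [data[key] for key in sorted(data.files)]


# Round trip with stand-in arrays shaped like Conv2D(1, 1) on 3 channels.
stand_in = [
    np.zeros((1, 1, 3, 1), dtype="float32"),  # kernel
    np.zeros((1,), dtype="float32"),          # bias
]
demo_path = os.path.join(tempfile.mkdtemp(), "demo_weights.npz")
dump_weight_list(stand_in, demo_path)
restored = read_weight_list(demo_path)
print([w.shape for w in restored])
```

In the saving script this would be called as `dump_weight_list(model.get_weights(), "weights.npz")`, and in the loading script as `model.set_weights(read_weight_list("weights.npz"))`. Because `set_weights` matches arrays by position rather than by name, this sidesteps the naming discrepancy, but it also means nothing checks that the two architectures actually match beyond the array shapes.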

fchollet commented 12 months ago

I found the cause: a change in TF seems to have broken the save file namespace in tf.keras (the introduction of a _layer_checkpoint_dependencies path which overrides layers). We have to fix this on the tf.keras side. Things are nominal in Keras Core.

nkovela1 commented 12 months ago

Sure thing, I just created a Colab to inspect the h5 file and found that same discrepancy with _layer_checkpoint_dependencies: https://colab.sandbox.google.com/drive/1Ir1AQp6DUtYXk-nomRVgjM11ukTPXnnt
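A similar inspection can be done locally by listing every dataset path in each weights file and comparing the two listings. This is a small sketch using h5py. The function name `list_weight_paths` is hypothetical, and the exact group names a real file contains are not shown in this thread, so none are assumed here.

```python
import h5py


def list_weight_paths(path):
    """Return (dataset_path, shape) for every dataset stored in an HDF5 file."""
    found = []

    def visit(name, obj):
        # Groups are only containers; the saved arrays are the datasets.
        if isinstance(obj, h5py.Dataset):
            found.append((name, obj.shape))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return found


# Example use, run once per framework on the file it saved:
#   for name, shape in list_weight_paths("model.weights.h5"):
#       print(name, shape)
```

Running it on a file saved by keras_core and on one saved by tf.keras, then diffing the printed paths, should show where the two namespaces diverge.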

fchollet commented 12 months ago

This is now fixed at HEAD but the issue will persist in TF 2.13 and TF 2.14. Use get_weights()/set_weights() as a workaround.

tirthasheshpatel commented 12 months ago

Sounds good, thanks for the quick fix!

AYREB commented 9 months ago

I'm not following. Would you mind elaborating on where I should insert the code for the get_weights()/set_weights() workaround? I am using a script to load and use the model, and that's where the issue is happening.

import keras
import numpy as np
import tensorflow as tf
from keras.models import load_model
import matplotlib.pyplot as plt

class_names = ['apple', 'banana', 'beetroot', 'bell pepper', 'cabbage', 'capsicum', 'carrot', 'cauliflower', 'chilli pepper', 'corn', 'cucumber', 'eggplant', 'garlic', 'ginger', 'grapes', 'jalepeno', 'kiwi', 'lemon', 'lettuce', 'mango', 'onion', 'orange', 'paprika', 'pear', 'peas', 'pineapple', 'pomegranate', 'potato', 'raddish', 'soy beans', 'spinach', 'sweetcorn', 'sweetpotato', 'tomato', 'turnip', 'watermelon']

image_path = "download.jpg"

model = keras.models.load_model("TrainedModel.keras")

img = tf.keras.utils.load_img(
    image_path, target_size=(180, 180)
)
img_array = tf.keras.utils.img_to_array(img)
img_array = tf.expand_dims(img_array, 0)  # Create a batch

predictions = model.predict(img_array)
score = tf.nn.softmax(predictions[0])

print(
    "This image most likely belongs to {} with a {:.2f} percent confidence."
    .format(class_names[np.argmax(score)], 100 * np.max(score))
)