keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0
62.15k stars 19.49k forks

Unable to load custom initializer from the saved model, passing custom_objects is not working #3867

Closed armancohan closed 3 years ago

armancohan commented 8 years ago

I have a simple custom initializer in the model. When I try to load the model, I get an invalid initialization error. I saw similar issues where the suggested solution was to pass the custom_objects argument to the load_model function. However, this did not work for me. This is the code to reproduce the problem:

from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.optimizers import SGD
from keras.models import model_from_json
from keras.models import load_model
from keras import initializations

def my_init(shape, name=None):
    return initializations.normal(shape, scale=0.01, name=name)

model = Sequential()
model.add(Dense(output_dim=64, input_dim=100, init=my_init))
model.add(Activation("relu"))
model.add(Dense(output_dim=10))
model.add(Activation("relu"))

# compile the model 
model.compile(loss='categorical_crossentropy', optimizer='sgd')
print("Compilation OK!")

# save the model to HDF5, then load it back
model.save('model.h5')
del model
model = load_model('model.h5', custom_objects={'my_init':my_init})
print("Load OK!")

Running it throws an error at the load_model line:

Exception: Invalid initialization: my_init

I also tried saving and loading with json using the model_from_json function, but the same issue appears.

dnola commented 8 years ago

I am having the same problem. Are there any updates on this?

Gwyki commented 8 years ago

I came across this problem today.

A fix is to add your custom init function to the 'initializations' module after it is defined:

setattr(initializations, 'my_init', my_init)

This doesn't fully decouple the model from the code, as it doesn't allow any init-specific parameters to be serialised out. Whether it is worth extending the initialization design to allow that (think of the way activation functions and constraints are currently handled) is debatable.
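
Why the setattr trick can work is easiest to see with a toy stand-in. The sketch below assumes the module resolves a string identifier by looking the name up in its own namespace, which the "Invalid initialization: my_init" message suggests. The module fake_initializations and the helper get are hypothetical names for illustration, not Keras code:

```python
import types

# Hypothetical stand-in for a module that resolves initializers by name.
fake_initializations = types.ModuleType("fake_initializations")


def get(identifier):
    """Look a name up in the stand-in module's namespace."""
    obj = getattr(fake_initializations, identifier, None)
    if obj is None:
        raise Exception("Invalid initialization: " + str(identifier))
    return obj


def my_init(shape, name=None):
    # Placeholder body; a real initializer would return a tensor.
    return ("my_init", shape, name)


first_error = None
try:
    get("my_init")
except Exception as err:
    first_error = str(err)  # the lookup fails before registration

setattr(fake_initializations, "my_init", my_init)
resolved = get("my_init")  # the lookup succeeds after registration
```

Under that assumption, the name only has to be resolvable at load time, which is also why nothing beyond the bare function name ends up in the saved model.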

edit: my slip up

PeterChe1990 commented 8 years ago

As in #1634, mocking the get function is another possible solution.

rousseau commented 7 years ago

Same issue here. However, using setattr does not solve the problem.

Furthermore, I don't understand why the initialization matters when loading a model.
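
A likely explanation, going by the tracebacks posted in this thread: load_model rebuilds each layer from its saved config by calling the layer's constructor, and the constructor resolves the initializer by name before any saved weights are restored. A toy sketch of that order of operations (ToyDense, toy_load and KNOWN_INITIALIZERS are made-up names, not Keras code):

```python
# Toy registry of initializers that can be resolved by name.
KNOWN_INITIALIZERS = {"zeros": lambda units: [0.0] * units}


class ToyDense:
    def __init__(self, units, init):
        # The constructor resolves the initializer by name, even on load.
        if init not in KNOWN_INITIALIZERS:
            raise ValueError("Unknown initializer: " + init)
        self.weights = KNOWN_INITIALIZERS[init](units)


def toy_load(config, saved_weights):
    layer = ToyDense(**config)     # step 1: rebuild the layer from its config
    layer.weights = saved_weights  # step 2: overwrite with the stored weights
    return layer


ok = toy_load({"units": 2, "init": "zeros"}, [0.5, -0.5])

try:
    toy_load({"units": 2, "init": "my_init"}, [0.5, -0.5])
    load_error = None
except ValueError as err:
    load_error = str(err)  # step 1 fails, so step 2 is never reached
```

In this picture the initializer's output is thrown away on load, but its name still has to resolve for the layer to be constructed at all.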

tuming1990 commented 7 years ago

I found a workaround to this problem. When you add layers to your model, the "weights" parameter can be set to numpy arrays. You can then use numpy to do the random initialization, for example:

import numpy

def weights_initialization(inputDim, outputDim, scale=0.1):
    # Draw (kernel, bias) from a zero-mean normal with variance `scale`.
    return (numpy.sqrt(scale) * numpy.random.randn(inputDim, outputDim),
            numpy.sqrt(scale) * numpy.random.randn(outputDim))

joeyearsley commented 7 years ago

https://github.com/fchollet/keras/pull/5012

from keras.utils.generic_utils import get_custom_objects
# CustomLosses is a user-defined class (not shown here)
metrics = CustomLosses()
get_custom_objects().update({"dice_coef_class_loss": metrics.dice_coef_class_loss})

alxy commented 7 years ago

I am having exactly the same issue. I tried to pass my custom initializer via custom_objects, but it does not appear to be recognized. Is there any update on this?

alxy commented 7 years ago

For anyone interested: the solution @joeyearsley presented works as long as you are working with a (callable) class. In the case of an initializer function I get TypeError: custom_initializer() missing 1 required positional argument: 'shape'

My workaround to this situation was to introduce a callable class, which did finally work:

class CustomInitializer:
    def __call__(self, shape, dtype=None):
        return custom_initializer(shape, dtype=dtype)

get_custom_objects().update({'custom_initializer': CustomInitializer})

model = load_model("../weights/custom_init_test.hdf5")

Still, I think this should be considered a bug, because the most natural interface/solution would be to use model = load_model('model.h5', custom_objects={'my_init':my_init})
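
A possible explanation for the TypeError, as a toy sketch rather than Keras code: if the loader treats whatever is registered under a name as a class and calls it with the (empty) saved config to build the initializer, then a plain function gets called without shape, while a class gets instantiated and the resulting instance is called with shape later. REGISTRY and toy_deserialize are hypothetical names:

```python
# Toy registry mapping saved names to user-supplied objects.
REGISTRY = {}


def toy_deserialize(name, config=None):
    """Build an initializer by calling the registered object with its config."""
    return REGISTRY[name](**(config or {}))


def custom_initializer(shape, dtype=None):
    return [0] * shape


class CustomInitializer:
    def __call__(self, shape, dtype=None):
        return custom_initializer(shape, dtype=dtype)


# Registering the function: it is called with no arguments and fails.
REGISTRY["custom_initializer"] = custom_initializer
try:
    toy_deserialize("custom_initializer")
    function_error = None
except TypeError as err:
    function_error = str(err)

# Registering the class: the call instantiates it; shape arrives later.
REGISTRY["custom_initializer"] = CustomInitializer
initializer = toy_deserialize("custom_initializer")
values = initializer(3)
```

If that reading is right, the wrapper class works only because instantiating it takes no arguments.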

stale[bot] commented 7 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

dnola commented 7 years ago

I agree this should be treated as a bug. It is important and a fairly common use case!

xiaoyongzhu commented 7 years ago

@alxy Can you share the full piece of code? I have a similar problem and tried this solution, but it looks like the object is not serializable. The error message is:

    self.vgg_model = vgg_from_t7(vgg_weights, target_layer='relu4_1')
  File "/datadrive/xiaoyzhu/StyleTransfer/AdaIN-TF/vgg_normalised.py", line 75, in vgg_from_t7
    json_string = model.to_json()
  File "/datadrive/xiaoyzhu/python3env/lib/python3.5/site-packages/keras/engine/topology.py", line 2668, in to_json
    model_config = self._updated_config()
  File "/datadrive/xiaoyzhu/python3env/lib/python3.5/site-packages/keras/engine/topology.py", line 2635, in _updated_config
    config = self.get_config()
  File "/datadrive/xiaoyzhu/python3env/lib/python3.5/site-packages/keras/engine/topology.py", line 2329, in get_config
    layer_config = layer.get_config()
  File "/datadrive/xiaoyzhu/python3env/lib/python3.5/site-packages/keras/layers/convolutional.py", line 462, in get_config
    config = super(Conv2D, self).get_config()
  File "/datadrive/xiaoyzhu/python3env/lib/python3.5/site-packages/keras/layers/convolutional.py", line 221, in get_config
    'kernel_initializer': initializers.serialize(self.kernel_initializer),
  File "/datadrive/xiaoyzhu/python3env/lib/python3.5/site-packages/keras/initializers.py", line 478, in serialize
    return serialize_keras_object(initializer)
  File "/datadrive/xiaoyzhu/python3env/lib/python3.5/site-packages/keras/utils/generic_utils.py", line 112, in serialize_keras_object
    raise ValueError('Cannot serialize', instance)
ValueError: ('Cannot serialize', <vgg_normalised.CustomInitializer object at 0x7f58ac0cd8d0>)

The code I am using is:

def custom_initializer(shape, dtype=None):
    return K.constant(0, shape = shape, dtype=dtype)

class CustomInitializer:
    def __call__(self, shape, dtype=None):
        return custom_initializer(shape, dtype=dtype)

get_custom_objects().update({'custom_initializer': CustomInitializer})

def vgg_from_t7(t7_file, target_layer=None):
          ... load weights from torch model

            x = Conv2D(filters, kernel_size, padding='valid', activation=None, name=name,
                        kernel_initializer=custom_initializer,
                        bias_initializer=custom_initializer,
                        trainable=False)(x)
          .... assuming the model is loaded successfully 

    # Hook it up
    model = Model(inputs=inp, outputs=x)
    model.save("savedmodel.hdf5")
    print("model saved")
    model = load_model("savedmodel.hdf5")
    print("model loaded")

alxy commented 7 years ago

@xiaoyongzhu Do you think this is really related? The error is different from the one I was getting. However, I would think that get_custom_objects().update({'custom_initializer': CustomInitializer}) immediately overrides custom_initializer, so try moving it to a location after the model has been saved.
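
As for the "Cannot serialize" message itself, it suggests the serializer could not derive either a config or a name from the CustomInitializer instance. A toy sketch of such a rule, assuming the serializer accepts objects that expose get_config() or __name__ and rejects everything else (my reading of the traceback, not the library's code; toy_serialize, PlainInitializer and ConfigurableInitializer are hypothetical names):

```python
def toy_serialize(instance):
    """Serialize to a config dict or a bare name, else refuse."""
    if hasattr(instance, "get_config"):
        return {"class_name": type(instance).__name__,
                "config": instance.get_config()}
    if hasattr(instance, "__name__"):
        return instance.__name__  # functions and classes carry a name
    raise ValueError("Cannot serialize", instance)


def zeros(shape, dtype=None):
    return [0] * shape


class PlainInitializer:
    # Callable, but an instance has neither get_config nor __name__.
    def __call__(self, shape, dtype=None):
        return [0] * shape


class ConfigurableInitializer(PlainInitializer):
    def __init__(self, value=0):
        self.value = value

    def get_config(self):
        return {"value": self.value}


as_function = toy_serialize(zeros)
as_config = toy_serialize(ConfigurableInitializer(1))

try:
    toy_serialize(PlainInitializer())
    plain_error = None
except ValueError as err:
    plain_error = err.args[0]
```

Under that assumption, passing the function itself, or giving the class a get_config method, would both be serializable, whereas an instance of the bare wrapper class would not.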

Also note that keras has changed quite a few things, as there have been new versions since then. The solution I found most useful was to load only the weights, not the entire model. If you still have the model definition, it is far easier to just do model.load_weights("savedmodel.hdf5"), and you don't have to bother with custom objects. That assumes model is already defined with all its layers and initializers.

jnorthrup commented 5 years ago

I have seen this error posted in several places on the internet; it has been fixed in tensorflowjs but not in keras or tf python.

My model is culled from an early-stopping callback; I'm not saving it manually. This appears to be common.

Traceback (most recent call last):
  File "/home/jim/mlcc-exercises/rejuvepredictor/stage4.py", line 175, in <module>
    custom_objects={'kernel_initializer':GlorotUniform}
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py", line 419, in load_model
    model = _deserialize_model(f, custom_objects, compile)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py", line 225, in _deserialize_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py", line 458, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "/usr/local/lib/python3.6/dist-packages/keras/layers/__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "/usr/local/lib/python3.6/dist-packages/keras/utils/generic_utils.py", line 145, in deserialize_keras_object
    list(custom_objects.items())))
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/sequential.py", line 300, in from_config
    custom_objects=custom_objects)
  File "/usr/local/lib/python3.6/dist-packages/keras/layers/__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "/usr/local/lib/python3.6/dist-packages/keras/utils/generic_utils.py", line 147, in deserialize_keras_object
    return cls.from_config(config['config'])
  File "/usr/local/lib/python3.6/dist-packages/keras/layers/recurrent.py", line 2298, in from_config
    return cls(**config)
  File "/usr/local/lib/python3.6/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/keras/layers/recurrent.py", line 2178, in __init__
    implementation=implementation)
  File "/usr/local/lib/python3.6/dist-packages/keras/layers/recurrent.py", line 1841, in __init__
    self.kernel_initializer = initializers.get(kernel_initializer)
  File "/usr/local/lib/python3.6/dist-packages/keras/initializers.py", line 508, in get
    return deserialize(identifier)
  File "/usr/local/lib/python3.6/dist-packages/keras/initializers.py", line 503, in deserialize
    printable_module_name='initializer')
  File "/usr/local/lib/python3.6/dist-packages/keras/utils/generic_utils.py", line 138, in deserialize_keras_object
    ': ' + class_name)
ValueError: Unknown initializer: GlorotUniform

swiss-knight commented 5 years ago

Same error as the one mentioned by jnorthrup here. My keras version is '2.2.4-tf'.

jnorthrup commented 5 years ago

I don't recall where I encountered this fix, but I believe this helped get past the blocker.

from tensorflow.python.keras.initializers import glorot_uniform ...

with CustomObjectScope({'GlorotUniform': glorot_uniform()}):
    model = load_model(
        MODEL_PREFIX + '.hdf5')  # compile_model(inmetrics=None, dropout_=dropout_, bias_regularizer_=regularize)
model.summary(print_fn=print)
# print(pd.DataFrame(final.history).describe())
plot_prediction(model)

swiss-knight commented 5 years ago

I solved my issue. I noticed that I had these imports:

import os, sys
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
import tensorboard as tb
from tensorflow import keras
from tensorflow.keras import Sequential
from tensorflow.keras import initializers
from keras.models import load_model

By digging into the .__file__ and/or .__path__ attributes of the keras modules, I finally noticed that the last line of my imports was actually importing the standalone keras module and not the one embedded in tensorflow. I had naively thought the embedded one would take priority over standalone keras because of my from tensorflow import keras import a few lines earlier. But that's not the case (probably related to how python decides to explore its own paths)!
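
The check described above can be sketched with the standard library alone. An import statement resolves the dotted name against installed packages rather than against names already bound in the file, so from keras.models import load_model refers to the standalone keras package even after from tensorflow import keras. The helper where_defined is a hypothetical name, and json.dumps stands in for load_model so that the snippet runs without Keras installed:

```python
import json
import sys


def where_defined(obj):
    """Return (module name, file path) for the module that defines obj."""
    module_name = getattr(obj, "__module__", None)
    module = sys.modules.get(module_name)
    return module_name, getattr(module, "__file__", None)


# Stand-in for where_defined(load_model) in a real Keras session.
name, path = where_defined(json.dumps)
print(name, path)
```

Calling where_defined(load_model) on the imported function should show at a glance whether the path sits under the keras package or under tensorflow.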

By changing my keras imports to purely tensorflow ones and tensorflow ones only:

import os, sys
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
import tensorboard as tb
from tensorflow import keras
# always load keras stuff from tf...
from tensorflow.keras import Sequential
from tensorflow.keras import initializers
from tensorflow.keras.models import load_model

(notice that the last line changes)

I no longer get the error!

/!\ So, to me, it seems that some of the keras submodules don't behave exactly the same as the ones embedded in tf.keras (I first thought there was some kind of automated push from keras to tf.keras, so I wouldn't have to care about that... but actually it seems that I must care!). For example here, the glorot_uniform call was correctly linked to "GlorotUniform" within tensorflow but not in keras itself.

Hope it helps.