keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

How to properly load custom layer from config? #1927

Closed EderSantana closed 8 years ago

EderSantana commented 8 years ago


How do I use model_from_json if I have a custom layer? The code here says that we just have to insert it into globals: https://github.com/fchollet/keras/blob/master/keras/utils/layer_utils.py#L23-L24

I did import the layer, but it didn't work for me. Am I missing something? Do they mean something else by "insert to globals"?

around1991 commented 8 years ago

Yeah, when you call model_from_json, set the custom_objects kwarg to be a dictionary with the keys being the custom layer name and the value being the custom layer itself.

In my case, I have a FeedForwardAttention layer, so I have to do

model = model_from_json(config, custom_objects={'FeedForwardAttention': FeedForwardAttention})

Hope this helps.

Kris
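
To make the lookup concrete, here is a minimal pure-Python sketch of the idea behind custom_objects. It is not Keras code: `toy_deserialize` and `BUILTIN_LAYERS` are hypothetical stand-ins. The only behaviour assumed from Keras is that the saved config records each layer by its class name and that the loader resolves that name through the custom_objects dictionary, so a key that differs from the recorded class name is never found.

```python
# Toy stand-in for the name lookup performed while loading a model.
# Nothing here is the Keras API; all names are hypothetical.

BUILTIN_LAYERS = {}  # pretend registry of layers the library ships with


class FeedForwardAttention:
    """Placeholder custom layer with no constructor arguments."""

    @classmethod
    def from_config(cls, config):
        return cls(**config)


def toy_deserialize(layer_config, custom_objects=None):
    """Resolve layer_config['class_name'] and rebuild the layer."""
    custom_objects = custom_objects or {}
    class_name = layer_config["class_name"]
    # Custom objects are consulted by the *recorded class name*.
    cls = custom_objects.get(class_name) or BUILTIN_LAYERS.get(class_name)
    if cls is None:
        raise ValueError("Unknown layer: " + class_name)
    return cls.from_config(layer_config["config"])


saved = {"class_name": "FeedForwardAttention", "config": {}}

# Key matches the recorded class name: the layer is rebuilt.
layer = toy_deserialize(saved, {"FeedForwardAttention": FeedForwardAttention})

# Key differs from the recorded class name: the lookup fails.
try:
    toy_deserialize(saved, {"MyAttention": FeedForwardAttention})
    lookup_error = None
except ValueError as exc:
    lookup_error = str(exc)
```

In this toy version the value in the dictionary is the class itself, which matches the advice above to pass "the custom layer itself".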

EderSantana commented 8 years ago

@around1991 it did work! Thank you very much!

visheshmistry commented 7 years ago

I did the exact thing as mentioned above but still got the following error:

(what I did) model=load_model('model.h5',custom_objects={"SpatialTransformer": SpatialTransformer()})

(error)

    Traceback (most recent call last):
      File "stn_test.py", line 119, in <module>
        model=load_model('model.h5',custom_objects={"SpatialTransformer": SpatialTransformer()})
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/models.py", line 240, in load_model
        model = model_from_config(model_config, custom_objects=custom_objects)
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/models.py", line 304, in model_from_config
        return layer_module.deserialize(config, custom_objects=custom_objects)
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/layers/__init__.py", line 54, in deserialize
        printable_module_name='layer')
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/utils/generic_utils.py", line 140, in deserialize_keras_object
        list(custom_objects.items())))
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/models.py", line 1202, in from_config
        layer = layer_module.deserialize(conf, custom_objects=custom_objects)
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/layers/__init__.py", line 54, in deserialize
        printable_module_name='layer')
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/utils/generic_utils.py", line 141, in deserialize_keras_object
        return cls.from_config(config['config'])
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/engine/topology.py", line 1231, in from_config
        return cls(**config)
    TypeError: __init__() missing 2 required positional arguments: 'localization_net' and 'output_size'

(SpatialTransformer.py)

    class SpatialTransformer(Layer):
        def __init__(self, localization_net, output_size, **kwargs):
            self.locnet = localization_net
            self.output_size = output_size
            super(SpatialTransformer, self).__init__(**kwargs)
            ...

I even tried:

model=load_model('model.h5',custom_objects={"SpatialTransformer": SpatialTransformer(localization_net=locnet,output_size=(30,30))})

but got the same error again

ankitmishra262 commented 7 years ago

Just send model=load_model('model.h5',custom_objects={"SpatialTransformer": SpatialTransformer})

visheshmistry commented 7 years ago

Thank you Ankit but I tried this too and it still did not work.

ankitmishra262 commented 7 years ago

Okay, I was getting the same issue today. If you have not done so already, try updating Keras to 2.0. Then follow the syntax given in https://keras.io/layers/writing-your-own-keras-layers/ for your layer.

This should solve your issue in most cases. If it doesn't, look into adding get_config to your custom layer declaration.

A workaround I can suggest is to save the model as JSON, save the weights separately with save_weights, and load both when required. If you aren't clear on how to do it, look at http://machinelearningmastery.com/save-load-keras-deep-learning-models/

salarjf commented 6 years ago

> Just send model=load_model('model.h5',custom_objects={"SpatialTransformer": SpatialTransformer})

Thanks Ankit! Your solution helped me.

To add to your hint: custom_objects also works with other model elements, such as custom loss functions.

Salar

MingleiLI commented 6 years ago

> Okay, I was getting the same issue today. If you have not done so already, try updating Keras to 2.0. Then follow the syntax given in https://keras.io/layers/writing-your-own-keras-layers/ for your layer.
>
> This should solve your issue in most cases. If it doesn't, look into adding get_config to your custom layer declaration.
>
> A workaround I can suggest is to save the model as JSON, save the weights separately with save_weights, and load both when required. If you aren't clear on how to do it, look at http://machinelearningmastery.com/save-load-keras-deep-learning-models/

It seems that there is nothing about how to write get_config in the links.

Monduiz commented 5 years ago

When using a custom layer, you will have to define a get_config method in the layer class. Example:

https://github.com/keras-team/keras/blob/master/keras/layers/convolutional.py#L214

This will show you how to adapt the get_config code to your custom layers.

With the example above:

    def get_config(self):
        config = {
            # the constructor stores localization_net as self.locnet
            'localization_net': self.locnet,
            'output_size': self.output_size
        }
        base_config = super(SpatialTransformer, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

You will need to retrain the model using the new class code.
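
Why get_config matters can be shown with a small pure-Python sketch. The classes below are hypothetical stand-ins, not Keras classes. The one detail borrowed from the traceback earlier in this thread is that the base from_config ends in `cls(**config)`: if get_config does not return the constructor arguments, that call fails with the same kind of TypeError. The sketch stores only a plain tuple; whether an entire localization network can be stored in a config this way is a separate question that it does not address.

```python
class ToyLayer:
    """Hypothetical stand-in for a base layer class (not keras.layers.Layer)."""

    def __init__(self, name=None):
        self.name = name or type(self).__name__.lower()

    def get_config(self):
        # The base class only knows about its own constructor arguments.
        return {"name": self.name}

    @classmethod
    def from_config(cls, config):
        return cls(**config)


class WithoutConfig(ToyLayer):
    """Custom layer that takes an extra argument but does not report it."""

    def __init__(self, output_size, **kwargs):
        self.output_size = output_size
        super().__init__(**kwargs)


class WithConfig(WithoutConfig):
    """Same layer, with get_config extended to include the extra argument."""

    def get_config(self):
        config = super().get_config()
        config["output_size"] = self.output_size
        return config


def round_trip(layer):
    """Serialize a layer to its config and rebuild it from that config."""
    return type(layer).from_config(layer.get_config())


# Without the override, the config lacks output_size and rebuilding fails.
try:
    round_trip(WithoutConfig(output_size=(30, 30)))
    failure = None
except TypeError as exc:
    failure = str(exc)

# With the override, the layer survives the round trip.
restored = round_trip(WithConfig(output_size=(30, 30)))
```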

offchan42 commented 5 years ago

@Monduiz I had a lot of problems with saving and loading Keras custom layers because Keras forgot my layer's fields. Why don't they mention get_config in their custom layer tutorial?

offchan42 commented 5 years ago

Also, why does Keras prefer return dict(list(base_config.items()) + list(config.items())) over

config.update(base_config)
return config

The second form is simpler. Is there any other reason that you know of?
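
One difference worth noting: the two forms are not equivalent when a key appears in both dictionaries. A small pure-Python check, using made-up dictionaries rather than real layer configs:

```python
base_config = {"name": "layer_1", "trainable": True}
config = {"name": "custom_name", "output_size": (30, 30)}

# Concatenating the item lists: later pairs win, so `config` overrides the base.
merged_a = dict(list(base_config.items()) + list(config.items()))

# The update() form: `base_config` overrides `config`, and update() also
# mutates the dictionary it is called on, so a copy is used here.
merged_b = dict(config)
merged_b.update(base_config)

# Both results have the same keys; they differ only where keys overlap.
same_keys = set(merged_a) == set(merged_b)
```

When the two dictionaries share no keys, both forms produce the same mapping, so the choice is mostly stylistic. Calling `base_config.update(config)` instead would keep the precedence of the original expression.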

AlkaSaliss commented 5 years ago

> @Monduiz I had a lot of problems with saving and loading Keras custom layers because Keras forgot my layer's fields. Why don't they mention get_config in their custom layer tutorial?

I can confirm this. I had a tough time before finding the get_config trick. It would be nice if the docs mentioned that one might need to override the get_config method in a custom layer.

AdnanRiaz107 commented 4 years ago

> Yeah, when you call model_from_json, set the custom_objects kwarg to be a dictionary with the keys being the custom layer name and the value being the custom layer itself.
>
> In my case, I have a FeedForwardAttention layer, so I have to do
>
> model = model_from_json(config, custom_objects={'FeedForwardAttention': FeedForwardAttention})
>
> Hope this helps Kris

Hi! I am trying to do the same, but I get the error "ValueError: Unknown layer: Attention".

AlkaSaliss commented 4 years ago

@AdnanRiaz107 do you have a layer called Attention in your code? (Also, a small code snippet might help us figure out where the problem comes from.)

AdnanRiaz107 commented 4 years ago

> @AdnanRiaz107 do you have a layer called Attention in your code? (Also, a small code snippet might help us figure out where the problem comes from.)

Yes, I have an attention layer followed by an LSTM layer.

AlkaSaliss commented 4 years ago

And can you provide the code snippet you use for loading the model, the one that gives the error?

AdnanRiaz107 commented 4 years ago

> And can you provide the code snippet you use for loading the model, the one that gives the error?

I have saved my model, and now I am trying to reload it using the following method:

    with CustomObjectScope({'AttentionLayer': Attention}):
        final_model = load_model("test_Att__-mse-rmse2ep_tl3.h5")

I got this error:

    Traceback (most recent call last):
      File "C:/Users/adnan/PycharmProjects/Code/rerun.py", line 268, in <module>
        final_model = load_model("test_Att__-mse-rmse2ep_tl3.h5")
      File "C:\Users\adnan\Anaconda3\lib\site-packages\keras\engine\saving.py", line 260, in load_model
        model = model_from_config(model_config, custom_objects=custom_objects)
      File "C:\Users\adnan\Anaconda3\lib\site-packages\keras\engine\saving.py", line 334, in model_from_config
        return deserialize(config, custom_objects=custom_objects)
      File "C:\Users\adnan\Anaconda3\lib\site-packages\keras\layers\__init__.py", line 55, in deserialize
        printable_module_name='layer')
      File "C:\Users\adnan\Anaconda3\lib\site-packages\keras\utils\generic_utils.py", line 145, in deserialize_keras_object
        list(custom_objects.items())))
      File "C:\Users\adnan\Anaconda3\lib\site-packages\keras\engine\network.py", line 1017, in from_config
        process_layer(layer_data)
      File "C:\Users\adnan\Anaconda3\lib\site-packages\keras\engine\network.py", line 1003, in process_layer
        custom_objects=custom_objects)
      File "C:\Users\adnan\Anaconda3\lib\site-packages\keras\layers\__init__.py", line 55, in deserialize
        printable_module_name='layer')
      File "C:\Users\adnan\Anaconda3\lib\site-packages\keras\utils\generic_utils.py", line 138, in deserialize_keras_object
        ': ' + class_name)
    ValueError: Unknown layer: Attention

    Process finished with exit code 1


Here is my code:

class Attention(Layer):
    def __init__(self, step_dim, bias=True,
                 W_regularizer=None, b_regularizer=None,
                 W_constraint=None, b_constraint=None, **kwargs):
        self.supports_masking = True
        self.init = initializers.get('glorot_uniform')
        self.W_regularizer = regularizers.get(W_regularizer)
        self.b_regularizer = regularizers.get(b_regularizer)

        self.W_constraint = constraints.get(W_constraint)
        self.b_constraint = constraints.get(b_constraint)
        self.bias = bias
        self.step_dim = step_dim
        self.features_dim = 0
        super(Attention, self).__init__(**kwargs)

    def build(self, input_shape):
        assert len(input_shape) == 3

        self.W = self.add_weight((input_shape[-1],),
                                 initializer=self.init,
                                 name='{}_W'.format(self.name),
                                 regularizer=self.W_regularizer,
                                 constraint=self.W_constraint)
        self.features_dim = input_shape[-1]

        if self.bias:
            self.b = self.add_weight((input_shape[1],),
                                     initializer='zero',
                                     name='{}_b'.format(self.name),
                                     regularizer=self.b_regularizer,
                                     constraint=self.b_constraint)
        else:
            self.b = None

        self.built = True

    def compute_mask(self, input, input_mask=None):
        return None

    def call(self, x, mask=None):
        features_dim = self.features_dim
        step_dim = self.step_dim

        eij = K.reshape(K.dot(K.reshape(x, (-1, features_dim)), K.reshape(self.W, (features_dim, 1))),
                        (-1, step_dim))

        if self.bias:
            eij += self.b

        eij = K.tanh(eij)

        a = K.exp(eij)

        if mask is not None:
            a *= K.cast(mask, K.floatx())

        a /= K.cast(K.sum(a, axis=1, keepdims=True) + K.epsilon(), K.floatx())

        a = K.expand_dims(a)
        weighted_input = x * a
        return K.sum(weighted_input, axis=1)

    def compute_output_shape(self, input_shape):
        return input_shape[0], self.features_dim

def LSTMS(X, Y, epochs=30, validation_split=0.2, patience=10):
    speed_input = Input(shape=(X.shape[1], X.shape[2]), name='speed')
    main_output = LSTM(input_shape=(X.shape[1], X.shape[2]), output_dim=X.shape[2],
                       return_sequences=False)(speed_input)
    main_output = Attention(3)(main_output)
    NAME = "1LSTM-mse-rmse{}".format(int(time.time()))
    tensorboard = TensorBoard(log_dir='logs/{}'.format(NAME), histogram_freq=150, batch_size=32,
                              write_graph=False, write_grads=True)

    final_model = Model(input=[speed_input], output=[main_output])
    final_model.summary()
    final_model.compile(loss='mse', optimizer='rmsprop')
    history = LossHistory()
    earlyStopping = EarlyStopping(monitor='val_loss', min_delta=0.00001, patience=patience, verbose=0, mode='auto')
    final_model.fit([X], Y, validation_split=0.2, epochs=epochs, callbacks=[history, earlyStopping, tensorboard])

    return final_model, history

AlkaSaliss commented 4 years ago

I've copy-pasted your code but was unable to run it and train a model; I get shape errors, and I can't figure out what the inputs and outputs look like. Here is the link to the Colab notebook where I tried it. Can you check what is wrong with the code in the notebook, and/or modify it to provide an example input that makes it work, so that I can try saving and loading?