keras-team / keras


model save/load not working properly #6073

Closed pianoman4873 closed 7 years ago

pianoman4873 commented 7 years ago

I have the following code:

def BuildMatrixFactorizationModel(usersCount, qaPairsCount):
    left = Sequential()
    left.add(Embedding(usersCount, EMBEDDING_DIM, input_length=1))
    left.add(Reshape((EMBEDDING_DIM,)))

    right = Sequential()
    right.add(Embedding(qaPairsCount, EMBEDDING_DIM, input_length=1))
    right.add(Reshape((EMBEDDING_DIM,)))

    model = Sequential()
    model.add(Merge([left, right], mode='dot'))
    model.compile(loss='mse', optimizer='adamax', metrics=['acc'])
    return model

if __name__ == '__main__':
    path = "e:/model.h5"
    model = BuildMatrixFactorizationModel(100000, 5000)
    model.save(path)
    del model
    model = load_model(path)

However, load_model fails with the error message:

File "C:\Users\pianoman\AppData\Local\Programs\Python\Python35\lib\site-packages\keras\engine\topology.py", line 2877, in load_weights_from_hdf5_group str(len(filtered_layers)) + ' layers.') ValueError: You are trying to load a weight file containing 2 layers into a model with 0 layers.

Any idea what is wrong?

Note that I was able to train the model; only loading the persisted model fails.

dhilton commented 7 years ago

I had to reformat your code, so please correct it if I've made a mistake:

from keras.models import Sequential
from keras.layers import Embedding
from keras.layers.core import Reshape
from keras.layers import Merge
from keras.models import load_model

EMBEDDING_DIM = 25

def BuildMatrixFactorizationModel(usersCount, qaPairsCount):
    left = Sequential()
    left.add(Embedding(usersCount, EMBEDDING_DIM, input_length=1))
    left.add(Reshape((EMBEDDING_DIM,)))

    right = Sequential()
    right.add(Embedding(qaPairsCount, EMBEDDING_DIM, input_length=1))
    right.add(Reshape((EMBEDDING_DIM,)))

    model = Sequential()
    model.add(Merge([left, right], mode='dot'))
    model.compile(loss='mse', optimizer='adamax', metrics=['acc'])
    return model

path = "model.h5"
model = BuildMatrixFactorizationModel(100000, 5000)
model.save(path)
del model
model = load_model(path)

The error you're seeing arises because your model's only top-level layer, the Merge layer, has no weights. Inside load_weights_from_hdf5_group, load_model filters the model down to the layers that actually own weights and then checks that their count matches the number of weighted layers recorded in the file.
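Roughly, the check works like this (a paraphrased sketch of load_weights_from_hdf5_group, not the verbatim source; the file-side name layer_names_in_file is illustrative):

def check_layer_count(model, layer_names_in_file):
    # Layers that own no weights (like the Merge layer here) are dropped
    # before the comparison, so this model contributes 0 layers ...
    filtered_layers = [layer for layer in model.layers if layer.weights]
    # ... while the weight file still records the two Embedding layers.
    if len(layer_names_in_file) != len(filtered_layers):
        raise ValueError('You are trying to load a weight file containing '
                         + str(len(layer_names_in_file)) + ' layers into a '
                         'model with ' + str(len(filtered_layers)) + ' layers.')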

If you look at the abstract base class for layers (https://github.com/fchollet/keras/blob/master/keras/engine/topology.py#L179), every layer has to implement get_weights and set_weights methods.
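You can see both halves of the mismatch directly on the model above (a quick check, assuming the Sequential model returned by BuildMatrixFactorizationModel):

model = BuildMatrixFactorizationModel(100000, 5000)
print(len(model.layers))              # 1: only the Merge layer sits at the top level
print(model.layers[0].get_weights())  # []: a dot-mode Merge owns no weights itself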

I believe your issue is related to #3927: for some reason the Merge layer has no weights of its own. But the merged layers (left and right in your example) do have weights, which is why the error message references two layers. A weights-only round trip might serve as a stopgap; see the sketch below. Sorry I couldn't help more.
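Untested sketch, assuming you can always rebuild the architecture from code:

# Persist the weights only and rebuild the model in code, so load_model
# never has to deserialize the Merge layer. Note that save_weights does
# not keep optimizer state, unlike model.save.
path = "e:/model_weights.h5"
model = BuildMatrixFactorizationModel(100000, 5000)
model.save_weights(path)

model = BuildMatrixFactorizationModel(100000, 5000)  # rebuild the graph
model.load_weights(path)                             # restore the weights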

stale[bot] commented 7 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

VjunetXuuftofi commented 7 years ago

Sorry to reopen this, but I wanted to know if there has been any change or if anybody has any suggestions. I'm having a similar issue with a different (but also nonstandard) architecture. My model:

from keras.models import Model
from keras.layers import Input, Dense, Masking, LSTM, LeakyReLU
from keras.optimizers import Adam

lstm_dim = 400
output_dim = max_length

main_input = Input((None, 3), name='main_input')
main_input_masked = Masking(mask_value=-1)(main_input)

hidden_input = Input(shape=(1,), name='hidden_input')

hidden_state = Dense(lstm_dim)(hidden_input)
hidden_state = LeakyReLU()(hidden_state)

c_state = Dense(lstm_dim)(hidden_input)
c_state = LeakyReLU()(c_state)

lstm_encoder, hidden, c = LSTM(lstm_dim, return_state=True, implementation=2)(
    main_input_masked, initial_state=[hidden_state, c_state])
output = Dense(2, activation="softmax")(lstm_encoder)

model = Model(inputs=[main_input, hidden_input], outputs=[output])

model.compile(loss="categorical_crossentropy", optimizer=Adam(lr=0.002), metrics ["binary_accuracy"])

Model compiles, fits, and predicts fine. It appears to save fine using model.save(). But when loaded, the following error occurs:

ValueError: Layer lstm_1 expects 1 inputs, but it received 3 input tensors. Input received: [<tf.Tensor 'masking_1_2/mul:0' shape=(?, ?, 3) dtype=float32>, <tf.Tensor 'leaky_re_lu_1_2/sub:0' shape=(?, 400) dtype=float32>, <tf.Tensor 'leaky_re_lu_2_2/sub:0' shape=(?, 400) dtype=float32>]

I tried saving the architecture and the weights separately, but loading the architecture from JSON did not work either. The architecture looked like this (note that lstm_1's inbound_nodes entry lists three inbound tensors, masking_1 plus the two LeakyReLU initial states, which matches the three input tensors in the error above):

{"class_name": "Model", "config": {"name": "model_1", "layers": [{"name": "hidden_input", "class_name": "InputLayer", "config": {"batch_input_shape": [null, 1], "dtype": "float32", "sparse": false, "name": "hidden_input"}, "inbound_nodes": []}, {"name": "main_input", "class_name": "InputLayer", "config": {"batch_input_shape": [null, null, 3], "dtype": "float32", "sparse": false, "name": "main_input"}, "inbound_nodes": []}, {"name": "dense_1", "class_name": "Dense", "config": {"name": "dense_1", "trainable": true, "units": 400, "activation": "linear", "use_bias": true, "kernel_initializer": {"class_name": "VarianceScaling", "config": {"scale": 1.0, "mode": "fan_avg", "distribution": "uniform", "seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}, "inbound_nodes": [[["hidden_input", 0, 0, {}]]]}, {"name": "dense_2", "class_name": "Dense", "config": {"name": "dense_2", "trainable": true, "units": 400, "activation": "linear", "use_bias": true, "kernel_initializer": {"class_name": "VarianceScaling", "config": {"scale": 1.0, "mode": "fan_avg", "distribution": "uniform", "seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}, "inbound_nodes": [[["hidden_input", 0, 0, {}]]]}, {"name": "masking_1", "class_name": "Masking", "config": {"name": "masking_1", "trainable": true, "mask_value": -1}, "inbound_nodes": [[["main_input", 0, 0, {}]]]}, {"name": "leaky_re_lu_1", "class_name": "LeakyReLU", "config": {"name": "leaky_re_lu_1", "trainable": true, "alpha": 0.30000001192092896}, "inbound_nodes": [[["dense_1", 0, 0, {}]]]}, {"name": "leaky_re_lu_2", "class_name": "LeakyReLU", "config": {"name": "leaky_re_lu_2", "trainable": true, "alpha": 0.30000001192092896}, "inbound_nodes": [[["dense_2", 0, 0, {}]]]}, {"name": "lstm_1", "class_name": "LSTM", "config": {"name": "lstm_1", "trainable": true, "return_sequences": false, "return_state": true, "go_backwards": false, "stateful": false, "unroll": false, "implementation": 2, "units": 400, "activation": "tanh", "recurrent_activation": "hard_sigmoid", "use_bias": true, "kernel_initializer": {"class_name": "VarianceScaling", "config": {"scale": 1.0, "mode": "fan_avg", "distribution": "uniform", "seed": null}}, "recurrent_initializer": {"class_name": "Orthogonal", "config": {"gain": 1.0, "seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "unit_forget_bias": true, "kernel_regularizer": null, "recurrent_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "recurrent_constraint": null, "bias_constraint": null, "dropout": 0.0, "recurrent_dropout": 0.0}, "inbound_nodes": [[["masking_1", 0, 0, {}], ["leaky_re_lu_1", 0, 0, {}], ["leaky_re_lu_2", 0, 0, {}]]]}, {"name": "dense_3", "class_name": "Dense", "config": {"name": "dense_3", "trainable": true, "units": 2, "activation": "softmax", "use_bias": true, "kernel_initializer": {"class_name": "VarianceScaling", "config": {"scale": 1.0, "mode": "fan_avg", "distribution": "uniform", "seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}, "inbound_nodes": [[["lstm_1", 0, 0, {}]]]}], 
"input_layers": [["main_input", 0, 0], ["hidden_input", 0, 0]], "output_layers": [["dense_3", 0, 0]]}, "keras_version": "2.0.6", "backend": "tensorflow"}

Is there any workaround for this issue, or do I have to implement this kind of architecture directly in TensorFlow?
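Before dropping to TensorFlow, one more thing to try along the same weights-only lines (a hedged sketch; build_model() is a hypothetical helper wrapping the model-construction code above):

model.save_weights("weights.h5")

fresh = build_model()  # re-run the architecture code instead of loading JSON
# by_name=True matches saved weights to layers by layer name, which can be
# more forgiving than positional matching when the graph wiring differs.
fresh.load_weights("weights.h5", by_name=True)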

phimit commented 7 years ago

Exact same problem here, with Keras 2.0.6 and TF 1.3. Here's the general structure of the model, basically a merge of two LSTMs:

model1 = Sequential()
model1.add(embed1)
model1.add(rnn_shared)
model1.add(BatchNormalization())

model2 = Sequential()
model2.add(embed2)
model2.add(rnn_shared)
model2.add(BatchNormalization())

model = Sequential()
model.add(Merge([model1, model2], mode="concat"))
model.add(Dropout(params["dropout"]))
model.add(Dense(1, activation='sigmoid'))

genimind commented 7 years ago

The Keras 2.0.8 release includes a fix for a similar issue regarding loading models with RNNs that take initial states.
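A quick way to check whether a given environment already has that fix (sketch; LooseVersion handles multi-digit patch releases correctly, unlike a plain string compare):

from distutils.version import LooseVersion
import keras

# Warn if the installed Keras predates the 2.0.8 RNN initial-state fix.
if LooseVersion(keras.__version__) < LooseVersion('2.0.8'):
    print('Keras %s is older than 2.0.8; upgrade with '
          '`pip install --upgrade keras`.' % keras.__version__)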