keras-team / keras-contrib

Keras community contributions
MIT License

Saving-Loading BILSTM-CRF model ValueError: Layer - loss missing #125

Closed iliaschalkidis closed 5 years ago

iliaschalkidis commented 7 years ago

I set up a BILSTM-CRF model for sequence labelling, very similar to this example (https://github.com/farizrahman4u/keras-contrib/blob/master/examples/conll2000_chunking_crf.py).

from keras.layers import Bidirectional, Dropout, LSTM, Masking
from keras.models import Sequential
from keras.optimizers import Adam
from keras_contrib.layers import CRF

self._model = Sequential(name='core_sequential')
self._model.add(Masking(mask_value=0., input_shape=(None, input_size)))

self._model.add(Dropout(dropout_rate, name='dropout_layer_1'))
# Bidirectional LSTM that returns one output per timestep for sequence labelling
self._model.add(Bidirectional(LSTM(units=hidden_unit_size,
                                   return_sequences=True,
                                   activation="tanh",
                                   name="lstm_layer"),
                              name='birnn_layer'))
# CRF output layer with 2 labels; its loss and accuracy are passed to compile()
crf = CRF(2, sparse_target=True)
self._model.add(crf)
self._model.compile(optimizer=Adam(lr=lr), loss=crf.loss_function, metrics=[crf.accuracy])

self._model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=epochs, batch_size=batch_size,
                verbose=verbose, shuffle=shuffle)

self._model.save(filename)

The model trained successfully, and predict_classes() also worked when I called it on the in-memory model object right after training...

Then I tried to load it back as described in the wiki section "A Common Gotcha", importing the additional layer before calling the load_model() function:

from keras.models import load_model
from keras_contrib.layers import CRF

self._model = load_model(filename)

The error message is:

[...]
  File "/Users/kiddo/anaconda/lib/python3.6/site-packages/keras/utils/generic_utils.py", line 140, in deserialize_keras_object
    list(custom_objects.items())))
  File "/Users/kiddo/anaconda/lib/python3.6/site-packages/keras/models.py", line 1202, in from_config
    layer = layer_module.deserialize(conf, custom_objects=custom_objects)
  File "/Users/kiddo/anaconda/lib/python3.6/site-packages/keras/layers/__init__.py", line 54, in deserialize
    printable_module_name='layer')
  File "/Users/kiddo/anaconda/lib/python3.6/site-packages/keras/utils/generic_utils.py", line 133, in deserialize_keras_object
    ': ' + class_name)
**ValueError: Unknown layer: CRF**

So, I changed to:

from keras_contrib.layers import CRF

self._model = load_model(filename, custom_objects={'CRF':CRF})

The new error says that the loss is also missing:

  File "/Users/kiddo/anaconda/lib/python3.6/site-packages/keras/engine/training.py", line 751, in compile
    loss_function = losses.get(loss)
  File "/Users/kiddo/anaconda/lib/python3.6/site-packages/keras/losses.py", line 96, in get
    return deserialize(identifier)
  File "/Users/kiddo/anaconda/lib/python3.6/site-packages/keras/losses.py", line 88, in deserialize
    printable_module_name='loss function')
  File "/Users/kiddo/anaconda/lib/python3.6/site-packages/keras/utils/generic_utils.py", line 157, in deserialize_keras_object
    ':' + function_name)
**ValueError: ('Unknown loss function', ':loss')**

I tried extending the custom_objects dict with ':loss': CRF.CRF.loss_function, but I still get the same error...

Any idea about that?

ParthShah412 commented 7 years ago

I too have the same issue

ParthShah412 commented 7 years ago

Found a solution. Use:

from keras_contrib.utils import save_load_utils

Save your model using:

save_load_utils.save_all_weights(model, filename)

To load it back, rebuild the model with the same architecture it was originally built with, then load all the weights with:

save_load_utils.load_all_weights(model, filename)

I did it and it worked fine for me.

iliaschalkidis commented 7 years ago

Thanks a lot @ParthShah412 It works fine for me also! A bit tricky and sloppy though... I need to add an if statement in my loader() to discriminate the loading between other models from 'CRF'-extended ones :D
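
For illustration, a minimal sketch of what such a branching loader could look like. All names here (load_any_model, has_crf, and the three injected callables) are hypothetical and not part of Keras or keras_contrib; the callables are passed in so the function itself assumes nothing about either library.

```python
def load_any_model(filename, has_crf, build_model, load_weights, load_full_model):
    """Load a model, branching on whether it contains a CRF layer.

    All parameters are hypothetical names chosen for this sketch:
      build_model     -- zero-argument callable that rebuilds the CRF architecture
      load_weights    -- callable (model, filename) that restores stored weights
      load_full_model -- callable (filename) for ordinary, fully serialized models
    """
    if has_crf:
        # CRF models: rebuild the architecture first, then restore weights only.
        model = build_model()
        load_weights(model, filename)
        return model
    # Other models: the regular full-model loader is enough.
    return load_full_model(filename)
```

In the setting of this thread, load_weights would presumably be save_load_utils.load_all_weights and load_full_model would be keras.models.load_model.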

quincyliang commented 7 years ago

I use save_load_utils.load_all_weights(model, filename) to load the model and it works. However, model.predict() is quite slow: it takes around 400 ms to predict a single input. How can I speed up prediction?

yzho0907 commented 6 years ago

Thanks, guys. Saving worked fine, but when I try to load the model using load_all_weights(model, filename), the following error is thrown:

in load_all_weights
    topology.load_weights_from_hdf5_group(f['model_weights'], model.layers)
AttributeError: 'NoneType' object has no attribute 'layers'

Any idea how to fix this?

lzfelix commented 6 years ago

@yzho0907 judging by your error message, it seems that you are setting model = None and then invoking load_all_weights(...). You first need to reconstruct your model.

Guys, the only way I've found to make this solution work is the following snippet, where build_bidirectional_model builds the model from examples/conll2000_chunking_crf.py. Do you have any ideas on solutions or possible causes for that?

loaded_model = build_bidirectional_model(vocab_size, EMBEDDING_DIM,
                                        LSTM_OUTPUT_SIZE, amount_classes,
                                        compile=True)

save_load_utils.load_all_weights(loaded_model, STORED_MODEL_FILENAME,
                                 include_optimizer=False)

loaded_model.evaluate(...)

If I set include_optimizer=True, I receive the following error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-80-b0de98ab256e> in <module>()
      2                                   100, amount_classes, True)
      3 
----> 4 saver.load_all_weights(loaded_model, TEST_FILE, True)
      5 
      6 # model2.compile('nadam', loss=crf.loss_function, metrics=[crf.accuracy])

~/miniconda3/envs/bicrf/lib/python3.6/site-packages/keras_contrib/utils/save_load_utils.py in load_all_weights(model, filepath, include_optimizer)
    106             optimizer_weight_values = [optimizer_weights_group[n] for n in
    107                                        optimizer_weight_names]
--> 108             model.optimizer.set_weights(optimizer_weight_values)

~/miniconda3/envs/bicrf/lib/python3.6/site-packages/keras/optimizers.py in set_weights(self, weights)
    111                              str(len(weights)) +
    112                              ') does not match the number of weights ' +
--> 113                              'of the optimizer (' + str(len(params)) + ')')
    114         weight_value_tuples = []
    115         param_values = K.batch_get_value(params)

ValueError: Length of the specified weight list (37) does not match the number of weights of the optimizer (0)

saxenarohit commented 6 years ago

How can I use this with checkpointing in Keras?

dterg commented 6 years ago

Same error here @lzfelix

mary-octavia commented 6 years ago

I can't even use save_load_utils. It's getting an import error

from keras.engine import saving

ImportError: cannot import name 'saving'

lzfelix commented 6 years ago

For reference, the solution posted on #129 seems to work, although #272 is addressing this problem.

mary-octavia commented 6 years ago

As an update, I managed to fix the saving import error by just updating keras (and modifying something in models.py). Now save_load_utils seems to work properly (.evaluate() gives me the same score on the same test data before and after loading...)

yzho0907 commented 6 years ago

@mary-octavia did you mean updating Keras to the latest version? If not, which version? Thanks.

lonelydancer commented 6 years ago

@mary-octavia same problem with keras '2.2.2',tf '1.10.1';

csJd commented 6 years ago

The following worked for me.

Save your model by using

model.save(filename)

or

model.save_weights(filename)

And you can load it back with model.load_weights(filename) after specifying the model architecture.
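
A minimal sketch of that weights-only round trip, for anyone reading later. The helper names and the build_model callable are hypothetical; only model.save_weights() and model.load_weights() are actual Keras model methods, and build_model is assumed to recreate exactly the architecture that was trained.

```python
def save_weights_only(model, filename):
    # Store just the layer weights; the architecture itself is not saved.
    model.save_weights(filename)


def restore_from_weights(build_model, filename):
    # build_model is a hypothetical zero-argument callable that rebuilds
    # the same architecture (including the CRF layer) as the saved model.
    model = build_model()
    model.load_weights(filename)
    return model
```

Because no layer or loss has to be deserialized by name, this route sidesteps the "Unknown layer" and "Unknown loss function" errors, at the cost of keeping the model-building code around.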

dorothee commented 6 years ago

@csJd really? Did that work for you using crf from keras_contrib.layers?

csJd commented 6 years ago

@csJd really? Did that work for you using crf from keras_contrib.layers?

Yes. keras.models.load_model(filename) does not work, but model.load_weights(filename) worked for me.

iliaschalkidis commented 6 years ago

If you only need the models for prediction (production) and not for retraining, the default save() and load_model() functions both work fine for models that include CRF layers, based on the latest updates in the code and using Keras 2.2.0, with just a naive "hack".

Example:

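from keras.models import load_model
from keras_contrib.layers import CRF

def fake_loss(y_true, y_pred):
    # Placeholder so load_model() can resolve the saved loss name; not usable for training
    return 0

model.save(filename)
model = load_model(filename, custom_objects={'CRF': CRF, 'loss': fake_loss})

lzfelix commented 5 years ago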

PR #318 should have fixed this problem. Please refer to the new CRF docs and the example in the test folder.

gabrieldemarmiesse commented 5 years ago

Closing this issue as it seems resolved. Thanks @lzfelix .

GishnuChandran commented 5 years ago

Thanks a lot @ParthShah412 It works fine for me also! A bit tricky and sloppy though... I need to add an if statement in my loader() to discriminate the loading between other models from 'CRF'-extended ones :D

Can you please share how you solved the issue? I am also facing the same problem.

smb-maker commented 3 years ago

I solved the unknown CRF layer error using:

from keras_contrib.layers.crf import CRF, crf_loss, crf_viterbi_accuracy

newmodel = load_model(model_name, custom_objects={"CRF": CRF, 'crf_loss': crf_loss, 'crf_viterbi_accuracy': crf_viterbi_accuracy})

It works well for me!
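
A note on why the custom_objects keys matter. Going by the traceback in the original post, the message appears to be built as ':' + function_name, so the name being looked up was 'loss' rather than ':loss'. The toy lookup below is only a simplified stand-in to illustrate that idea, not Keras's actual code; resolve_by_name and its parameters are hypothetical.

```python
def resolve_by_name(name, builtins, custom_objects=None, kind="loss function"):
    """Toy name-based lookup: custom objects first, then built-ins."""
    custom_objects = custom_objects or {}
    if name in custom_objects:
        return custom_objects[name]
    if name in builtins:
        return builtins[name]
    # Mirrors the shape of the error seen in the thread.
    raise ValueError("Unknown " + kind, ":" + name)


def fake_loss(y_true, y_pred):
    return 0


builtin_losses = {}    # stand-in for the library's own losses
saved_name = "loss"    # the name the traceback suggests was stored

# A key that matches the stored name resolves.
resolved = resolve_by_name(saved_name, builtin_losses, {"loss": fake_loss})

# A key with a leading colon does not match and raises ValueError.
try:
    resolve_by_name(saved_name, builtin_losses, {":loss": fake_loss})
except ValueError as err:
    print(err.args)
```

Under that reading, the keys 'crf_loss' and 'crf_viterbi_accuracy' above work because they match the names of the functions the model was compiled with.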