keras-team / keras-hub

Modular Natural Language Processing workflows with Keras
Apache License 2.0

When loading a saved model, if keras_nlp is not imported, an untrained model is created #717

Open elbamos opened 1 year ago

elbamos commented 1 year ago

Describe the bug
I built a model using the TokenAndPositionEmbedding and TransformerEncoder layers, then saved the model. If I reload the model after importing keras_nlp, everything is fine. If, however, I reload the model without importing keras_nlp, I get a model that performs much worse on the same data. The restored model performs like a model that has never been trained. It is as if (and this is what I think is happening) the weights for something inside the TransformerEncoder layers are not being restored, and new weights are being created instead.

The reason this is particularly problematic is that I would like to deploy my trained model on Vertex, where I don't think keras_nlp is available.

To Reproduce
ReproductSaveIssue.ipynb.zip

Expected behavior
The model should be restored properly.

Additional context
This issue may be related: https://github.com/keras-team/keras-nlp/issues/219

Would you like to help us fix it?

How can I help?

elbamos commented 1 year ago

@jbischof wanted to make sure you saw this; it's a minimal reproducible example of the saving issue I tried to raise a few days ago.

mattdangerw commented 1 year ago

Here's a colab version of the repro -> https://colab.research.google.com/gist/mattdangerw/9e14379d76991bf61687348cbebb957b/reproductsaveissue.ipynb

This is a good catch! We don't have testing for saving a model in an environment where keras_nlp is available and then loading it in an environment where it is not.

I am not totally sure what the correct fix is here; let us get back to you! At the highest level, we do want our models to be easily usable with Vertex, so we should make sure we have a good path of support there.

elbamos commented 1 year ago

Am I wrong that if a compiled Keras model is saved with traces, it should be runnable on another system without the Python code for its modules? The model does convert correctly to TFLite in portable form.
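For reference, the TFLite path mentioned above can be sketched like this (a minimal example with a tiny plain Keras model standing in for the real one; the flatbuffer it produces carries the traced graph and trained weights with no Python class information):

```python
import numpy as np
import tensorflow as tf

# A tiny stand-in model; conversion bakes the traced graph and weights
# into a self-contained flatbuffer.
model = tf.keras.Sequential(
    [tf.keras.Input(shape=(2,)), tf.keras.layers.Dense(1)]
)

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_bytes = converter.convert()

# The resulting bytes run anywhere the TFLite runtime exists, with no
# dependency on the Python code that defined the layers.
interpreter = tf.lite.Interpreter(model_content=tflite_bytes)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp["index"], np.ones((1, 2), dtype=np.float32))
interpreter.invoke()
print(interpreter.get_tensor(out["index"]).shape)
```

This portability is exactly why the TFLite conversion works without keras_nlp installed on the target system.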

mattdangerw commented 1 year ago

> Am I wrong that if a compiled Keras model is saved with traces, it should be runnable on another system without the Python code for its modules? The model does convert correctly to TFLite in portable form.

Yes, that is correct, though the best way to get a model that is purely "traced" would probably be the low-level tf.saved_model.save API. That will give you a fully traced graph without any of the Python code. model.save() is actually a bit of a hybrid: it will attempt to revive Python objects and layer some model traces on top, so for model.save(), having the KerasNLP library imported in the deployment environment is probably a good idea.
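The low-level path described above can be sketched roughly as follows (a minimal example using a plain tf.Module rather than a KerasNLP model; only the traced graph and variables are written out, no Python class information):

```python
import tensorflow as tf

# A stand-in for a trained model. The input_signature forces a concrete
# trace to be saved alongside the variables.
class AddBias(tf.Module):
    def __init__(self):
        super().__init__()
        self.bias = tf.Variable(1.0)

    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
    def __call__(self, x):
        return x + self.bias

module = AddBias()
tf.saved_model.save(module, "/tmp/traced_module")

# In the deployment environment, no import of the defining module is
# needed: the restored object serves the traced function directly.
restored = tf.saved_model.load("/tmp/traced_module")
print(restored(tf.constant([1.0, 2.0])).numpy())
```

Because nothing here depends on reviving Python classes, a graph exported this way can be loaded on a system that has TensorFlow but not the library that defined the layers.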

As a random note, this confusion is a big pain point, and there is active work at the core Keras level to clean up this picture. There was some info about future directions in our latest community meeting -> https://drive.google.com/file/d/1wWg_5Eu0ODQhXdLPMPpqt2EiiBg2um1O/view?usp=sharing

So hopefully the picture around all of this will be less confusing soon!

jayam30 commented 1 year ago

Hey! I would like to work on fixing this bug.