keras-team / tf-keras

The TensorFlow-specific implementation of the Keras API, which was the default Keras from 2019 to 2023.
Apache License 2.0

efficientnetBx model.save() fails due to serialization problem with tf2.10.0 #383

Closed jeromemassot closed 1 year ago

jeromemassot commented 2 years ago

Describe the current behavior. model.save() fails and reports a serialization problem.

Describe the expected behavior. The Keras model is saved without error.

Source code / logs.

WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op while saving (showing 5 of 273). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: ./models/EfficientNetB7_Naiads.h5py\assets
INFO:tensorflow:Assets written to: ./models/EfficientNetB7_Naiads.h5py\assets
Output exceeds the size limit. Open the full output data in a text editor

TypeError                                 Traceback (most recent call last)
Cell In [31], line 1
----> 1 model.save('./models/EfficientNetB7_Naiads.h5py')

File e:\02- Vision Projects\01- Naiads Projects\notebooks.venv\lib\site-packages\keras\utils\traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # tf.debugging.disable_traceback_filtering()
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File C:\Python310\lib\json\encoder.py:199, in JSONEncoder.encode(self, o)
    195         return encode_basestring(o)
    196 # This doesn't pass the iterator directly to ''.join() because the
    197 # exceptions aren't as detailed. The list call should be roughly
    198 # equivalent to the PySequence_Fast that ''.join() would do.
--> 199 chunks = self.iterencode(o, _one_shot=True)
    200 if not isinstance(chunks, (list, tuple)):
    201     chunks = list(chunks)

File C:\Python310\lib\json\encoder.py:257, in JSONEncoder.iterencode(self, o, _one_shot)
    252 else:
    253     _iterencode = _make_iterencode(
    ...
    255         self.key_separator, self.item_separator, self.sort_keys,
    256         self.skipkeys, _one_shot)
--> 257 return _iterencode(o, 0)

TypeError: Unable to serialize [2.0897 2.1129 2.1082] to JSON. Unrecognized type <class 'tensorflow.python.framework.ops.EagerTensor'>.

sushreebarsa commented 2 years ago

@jeromemassot In order to expedite the trouble-shooting process, please provide a code snippet to reproduce the issue reported here. Thank you!

jeromemassot commented 2 years ago

Hi @sushreebarsa, thanks for your reply. The code I am using is a carbon copy of this Keras example: https://github.com/keras-team/keras-io/blob/master/examples/vision/image_classification_efficientnet_fine_tuning.py

Two tiny differences:

1- The fit() method is using an early stopping callback as follows:

early_stopping_callback = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    min_delta=0.05,
    patience=3,
    restore_best_weights=True,
)

hist = model.fit(
    train_ds,
    epochs=epochs,
    steps_per_epoch=train_steps_per_epoch,
    validation_data=validation_ds,
    validation_steps=validation_steps_per_epoch,
    callbacks=[early_stopping_callback],
    verbose=1,
)

2- model.save('./models/EfficientNetB7_xx.h5py') is called after the training is completed. This last command triggers the serialization problem.

Thanks for your help. Best regards Jerome

jbischof commented 2 years ago

@sushreebarsa, were you able to replicate? I don't see a gist

SuryanarayanaY commented 2 years ago

Hi @jbischof,

I am unable to replicate the issue with the exact code mentioned by @jeromemassot because of a TPU issue with my Colab, and also because it is a large model. But I am pretty sure that this is due to a serialization problem with the EfficientNet models from tf.keras.applications.efficientnet. These models work fine with version 2.9.2 and have had the serialization issue since 2.10, including tf-nightly. Please refer to the attached gist with minimal code to replicate the problem. The models save without error if we use tf.saved_model.save() instead of model.save(). Refer to gist1, which replicates the issue with 2.10 and nightly versions, and gist2, which replicates the issue for all EfficientNet models.

All of the models tested above, including the one in this issue (keras-team/tf-keras#383), fail with the same serialization error: TypeError: Unable to serialize [2.0896919 2.1128857 2.1081853] to JSON. Unrecognized type <class 'tensorflow.python.framework.ops.EagerTensor'>. Even the values in the unserializable array are identical across models.
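
The failure mode can be illustrated without TensorFlow. This is only a minimal sketch, not the Keras code path: `OpaqueArray` is a hypothetical stand-in for an EagerTensor (any type the standard `json` encoder does not know), and `try_serialize` / `to_plain` are illustrative helper names that do not exist in Keras.

```python
import json


class OpaqueArray:
    """Hypothetical stand-in for an EagerTensor: holds numbers, but is not JSON-native."""

    def __init__(self, values):
        self.values = list(values)


# A layer config whose "scale" entry is not a plain Python type.
layer_config = {
    "name": "rescaling",
    "scale": OpaqueArray([2.0896919, 2.1128857, 2.1081853]),
}


def try_serialize(config):
    """Return the JSON string, or a short failure message if encoding raises TypeError."""
    try:
        return json.dumps(config)
    except TypeError as exc:
        return f"failed: {exc}"


def to_plain(config):
    """Replace OpaqueArray values with plain lists of floats, which json can encode."""
    return {
        key: (value.values if isinstance(value, OpaqueArray) else value)
        for key, value in config.items()
    }
```

`try_serialize(layer_config)` reports a failure, while `try_serialize(to_plain(layer_config))` returns a JSON string. The same distinction applies to a layer config that holds a tensor versus one that holds plain Python floats.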

@jeromemassot, could you also please check whether saving the model works with version 2.9.2 and, alternatively, with tf.saved_model.save()?

Kaschi14 commented 2 years ago

I get the exact same error with even the same numbers in the array which are not serializable. Downgrading tf doesn't work for me because of other dependencies... Anyone got any ideas?

Kaschi14 commented 2 years ago

@SuryanarayanaY Saving with tf.saved_model.save() works, however loading only works with tf.saved_model.load() which makes it pretty much useless, as it doesn't come as a keras object.

Trying to load the model with keras.models.load_model() results in a ValueError: Unable to create a Keras model from SavedModel at D:\model_weights\EfficientNet. This SavedModel was exported with 'tf.saved_model.save', and lacks the Keras metadata file. Please save your Keras model by calling 'model.save' or 'tf.keras.models.save_model'. Note that you can still load this SavedModel with 'tf.saved_model.load'.

Dobiasd commented 2 years ago

Same problem here:

import tensorflow as tf
model = tf.keras.applications.efficientnet.EfficientNetB0()
model.save('model.json')

->

Traceback (most recent call last):
  File "//main.py", line 4, in <module>
    model.save('model.json')
  File "/usr/local/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/usr/local/lib/python3.10/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/lib/python3.10/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
TypeError: Unable to serialize [2.0896919 2.1128857 2.1081853] to JSON. Unrecognized type <class 'tensorflow.python.framework.ops.EagerTensor'>.

Here is a Dockerfile to easily reproduce it:

FROM python:3.10.8

RUN apt-get update
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get install -y build-essential cmake

RUN pip3 install tensorflow==2.10.0

RUN echo "\n\
import tensorflow as tf\n\
model = tf.keras.applications.efficientnet.EfficientNetB0()\n\
model.save('model.json')\n\
" > main.py

RUN python3 main.py

maxcrous commented 1 year ago

Also facing this issue.

nbrasher commented 1 year ago

Encountered an identical issue with EfficientNetB1 and tensorflow-macos==2.10.0

SueGreen commented 1 year ago

Got the exact same issue with TF 2.10 and TF 2.11 (tried to save the model in both SavedModel and H5 formats, neither worked). Downgrading to TF 2.9 helped for now, but it would be really nice to be able to use the features of more recent TF versions.
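
For anyone scripting around this, the version reports collected so far in this thread can be summarised in a small helper. This is only a sketch of what commenters reported, not an official compatibility table; `thread_status` is a hypothetical name, and versions nobody has reported on are deliberately left as unknown.

```python
def thread_status(tf_version: str) -> str:
    """Classify a TensorFlow version string by what this thread reports for it."""
    # Only major.minor matter here; patch and dev suffixes are ignored.
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    if (major, minor) == (2, 9):
        return "reported working"
    if (major, minor) in {(2, 10), (2, 11)}:
        return "reported failing"
    return "not reported"
```

It could be called with `tf.__version__`, for example to decide whether a local workaround is needed before calling model.save().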

hctomkins commented 1 year ago

NB if useful:

This issue seems to be triggered by the fact the hard-baked normalisation constants are getting evaluated to an EagerTensor before the scaling layer is built - see here

I fixed this locally by moving the logic into python:

At the top:

import math

IMAGENET_STDDEV_RGB = [0.229, 0.224, 0.225]
IMAGENET_STDDEV_RGB = [1 / math.sqrt(i) for i in IMAGENET_STDDEV_RGB]

Then on build just do: x = layers.Rescaling(IMAGENET_STDDEV_RGB)(x)

I don't have time to raise a PR right now with failing test cases (I also suspect this isn't the most elegant solution), but I thought a guide to a hotfix might at least help anyone who does!
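
As a quick check of the hotfix above, here is a small sketch using only the standard library. It assumes nothing about Keras internals beyond the constants quoted in this thread; `scale` and `config_json` are illustrative names.

```python
import json
import math

# Constants as quoted in this thread.
IMAGENET_STDDEV_RGB = [0.229, 0.224, 0.225]

# Computing the scale in pure Python yields plain floats rather than a tensor.
scale = [1.0 / math.sqrt(stddev) for stddev in IMAGENET_STDDEV_RGB]

# Plain floats are JSON-native, so a config holding them can be encoded.
config_json = json.dumps({"scale": scale})

# Rounded to four decimals these are 2.0897, 2.1129 and 2.1082,
# the same numbers that appear in the error message.
rounded = [round(value, 4) for value in scale]
```

The values match the array in the TypeError, which supports the diagnosis that the unserializable object is the rescaling factor derived from these constants.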

jeromemassot commented 1 year ago

Thanks @hctomkins

So, is the TF team planning to include this fix in a future version of the EfficientNet code, or should we continue to use this local fix in our code?

Thanks Best regards Jerome

diricxbart commented 1 year ago

Thanks @hctomkins Based on your suggestion, for now we implemented the following workaround in our Dockerfile:

# Workaround for TF 2.10 & 2.11 issue: https://github.com/keras-team/tf-keras/issues/383: "efficientnetBx model.save() fails due to serialization problem with tf2.10.0"
RUN sudo sed -i 's/IMAGENET_STDDEV_RGB = \[0.229, 0.224, 0.225\]/IMAGENET_STDDEV_RGB = \[1 \/ math.sqrt(i) for i in \[0.229, 0.224, 0.225\]\]/g' \
    /usr/local/lib/python3.8/dist-packages/keras/applications/efficientnet.py && \
    sudo sed -i 's/x = layers.Rescaling(1.0 \/ tf.math.sqrt(IMAGENET_STDDEV_RGB))(x)/x = layers.Rescaling(IMAGENET_STDDEV_RGB)(x)/g' \
    /usr/local/lib/python3.8/dist-packages/keras/applications/efficientnet.py

berndporr commented 1 year ago

Same here: Unable to serialize [2.0896919 2.1128857 2.1081853] to JSON in 2.11.0. It is easy to reproduce: I simply followed the "Load video data" tutorial and used the Keras model at the bottom (https://www.tensorflow.org/tutorials/load_data/video#next_steps). Just add model.save() at the end and it will crash. Ironically, saving as a TF Lite model works!

RaistlinD2x commented 1 year ago

I can see why people moved to PyTorch.

sngjuk commented 1 year ago

Apply https://github.com/keras-team/keras/commit/5b931e64e262c3b44125fdb2534fb4a940cd6e79 fix manually (as @hctomkins mentioned)

location: lib/python3.10/site-packages/keras/applications/efficientnet.py (py3.10)

EDIT this:

x = layers.Rescaling(1.0 / tf.math.sqrt(IMAGENET_STDDEV_RGB))(x)

TO:

x = layers.Rescaling(
    [1.0 / math.sqrt(stddev) for stddev in IMAGENET_STDDEV_RGB]
)(x)
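
The manual edit above can also be scripted. The following is a hedged sketch that works on the text of efficientnet.py; `patch_source`, `OLD_CALL` and `NEW_CALL` are hypothetical names, and it assumes the original call sits on a single line exactly as quoted in this thread and that the module already imports `math` (worth checking in your installed copy).

```python
# The single-line call quoted in this thread as the one to replace.
OLD_CALL = "x = layers.Rescaling(1.0 / tf.math.sqrt(IMAGENET_STDDEV_RGB))(x)"

# The replacement quoted in this thread, one list entry per output line.
NEW_CALL = [
    "x = layers.Rescaling(",
    "    [1.0 / math.sqrt(stddev) for stddev in IMAGENET_STDDEV_RGB]",
    ")(x)",
]


def patch_source(text: str) -> str:
    """Return the source text with the Rescaling call rewritten, keeping indentation."""
    patched = []
    for line in text.splitlines():
        if line.strip() == OLD_CALL:
            # Reuse the leading whitespace of the original line.
            indent = line[: len(line) - len(line.lstrip())]
            patched.extend(indent + new_line for new_line in NEW_CALL)
        else:
            patched.append(line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(patched) + ("\n" if text.endswith("\n") else "")
```

To use it, read the installed efficientnet.py, pass its contents through `patch_source`, and write the result back. The file path depends on your environment, and running the function a second time leaves the text unchanged.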

SuryanarayanaY commented 1 year ago

@jeromemassot ,

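The PR has been merged into the master branch. I have tested the code with the latest tf-nightly (2.13.0-dev20230409) and there is no error now. Please refer to the attached gist: https://colab.research.google.com/gist/SuryanarayanaY/8983f306b822e1dec504d85af09d5a6b/17199-tf-nightly-2-13.ipynb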

If anybody still faces the issue in tf-nightly, please let us know.

It seems the commit has not been cherry-picked into the latest release versions. I will raise this with the concerned team and let you know whether it can be cherry-picked into TF 2.12. Until then, users are requested to use tf-nightly.

Thanks!

RaistlinD2x commented 1 year ago

TF only works for most systems on 2.10 for GPU work so an update to 2.13 isn’t really useful. Google Colab uses 2.9.2, AWS uses 2.10, WSL for Windows to use GPU uses 2.10.

arianamzp commented 1 year ago

Just use EfficientNetV2. It works like a charm, does not have this problem, and performs better anyway.