Closed jeromemassot closed 1 year ago
@jeromemassot In order to expedite the trouble-shooting process, please provide a code snippet to reproduce the issue reported here. Thank you!
Hi @sushreebarsa Thanks for your reply. The code that I am using is the carbon-copy of this one available in the Keras example. https://github.com/keras-team/keras-io/blob/master/examples/vision/image_classification_efficientnet_fine_tuning.py
Two tiny differences:
1- The fit() method is using an early stopping callback as follows:
early_stopping_callback = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    min_delta=0.05,
    patience=3,
    restore_best_weights=True,
)
hist = model.fit(
    train_ds,
    epochs=epochs,
    steps_per_epoch=train_steps_per_epoch,
    validation_data=validation_ds,
    validation_steps=validation_steps_per_epoch,
    callbacks=[early_stopping_callback],
    verbose=1,
)
2- model.save('./models/EfficientNetB7_xx.h5py') is called after the training is completed. This last call triggers the serialization problem.
Thanks for your help. Best regards Jerome
@sushreebarsa, were you able to replicate? I don't see a gist
Hi @jbischof,
I am unable to replicate the issue with the exact code mentioned by @jeromemassot because of a TPU issue with my Colab, and also because it is a large model. But I am fairly sure that this is due to a serialization problem with the EfficientNet models from tf.keras.applications.efficientnet. These models work fine with version 2.9.2 and have the serialization issue from version 2.10 onward, including tf-nightly. Please refer to the attached gist with minimal code to replicate the problem. The models save without error if we use tf.saved_model.save() instead of model.save(). Refer to gist1, which replicates the issue with 2.10 and nightly versions, and gist2, which replicates the issue for all EfficientNet models.
All the models tested above, including the one in this particular issue (keras-team/tf-keras#383), fail with the same serialization error:
TypeError: Unable to serialize [2.0896919 2.1128857 2.1081853] to JSON. Unrecognized type <class 'tensorflow.python.framework.ops.EagerTensor'>
Even the values in the list that fails to serialize are the same in every case.
@jeromemassot, could you also please cross-check whether model saving works with version 2.9.2, and alternatively with tf.saved_model.save()?
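The TypeError quoted above is the kind of failure a JSON encoder produces when a config contains a type it has no rule for. A minimal stdlib-only sketch of that failure mode follows. `FakeTensor` and `try_dump` are hypothetical names, `FakeTensor` is only a stand-in for an EagerTensor (TensorFlow is not imported), and Keras uses its own encoder, so the exact message differs from the one in this thread.

```python
import json


class FakeTensor:
    """Hypothetical stand-in for an EagerTensor: holds numbers but is not a list."""

    def __init__(self, values):
        self.values = list(values)

    def tolist(self):
        return list(self.values)


def try_dump(config):
    """Return the JSON string, or the error text if encoding fails."""
    try:
        return json.dumps(config)
    except TypeError as exc:
        return f"TypeError: {exc}"


scale = FakeTensor([2.0896919, 2.1128857, 2.1081853])

# A layer config that holds the tensor-like object cannot be encoded...
failing = try_dump({"name": "rescaling", "scale": scale})

# ...while the same numbers as a plain Python list encode fine.
working = try_dump({"name": "rescaling", "scale": scale.tolist()})

print(failing)
print(working)
```

The point of the sketch is only that the values themselves are harmless; it is the container type stored in the config that the encoder rejects.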
I get the exact same error with even the same numbers in the array which are not serializable. Downgrading tf doesn't work for me because of other dependencies... Anyone got any ideas?
@SuryanarayanaY
Saving with tf.saved_model.save() works; however, loading then only works with tf.saved_model.load(), which makes it pretty much useless, as the result doesn't come back as a Keras object.
Trying to load the model with keras.models.load_model() results in a ValueError: Unable to create a Keras model from SavedModel at D:\model_weights\EfficientNet. This SavedModel was exported with 'tf.saved_model.save', and lacks the Keras metadata file. Please save your Keras model by calling 'model.save' or 'tf.keras.models.save_model'. Note that you can still load this SavedModel with 'tf.saved_model.load'.
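The ValueError above says the export lacks the Keras metadata file. A small sketch of how one might check an export directory before choosing a loader is shown below. It assumes that `model.save()` writes a `keras_metadata.pb` file next to `saved_model.pb` and that `tf.saved_model.save()` does not; `savedmodel_kind` is a hypothetical helper, and the temporary directory with empty files only simulates the two layouts.

```python
import os
import tempfile


def savedmodel_kind(export_dir):
    """Classify a SavedModel directory by the files it contains.

    Assumption: a Keras export has `keras_metadata.pb` beside
    `saved_model.pb`; a plain tf.saved_model.save() export does not.
    """
    has_graph = os.path.isfile(os.path.join(export_dir, "saved_model.pb"))
    has_keras = os.path.isfile(os.path.join(export_dir, "keras_metadata.pb"))
    if not has_graph:
        return "not a SavedModel"
    return "keras" if has_keras else "tf-only"


# Simulate both layouts with empty placeholder files.
with tempfile.TemporaryDirectory() as export_dir:
    open(os.path.join(export_dir, "saved_model.pb"), "wb").close()
    kind_tf_only = savedmodel_kind(export_dir)

    open(os.path.join(export_dir, "keras_metadata.pb"), "wb").close()
    kind_keras = savedmodel_kind(export_dir)

print(kind_tf_only, kind_keras)
```

A "tf-only" result would suggest that only tf.saved_model.load() can read the directory, which matches the behavior described in this comment.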
Same problem here:
import tensorflow as tf
model = tf.keras.applications.efficientnet.EfficientNetB0()
model.save('model.json')
->
Traceback (most recent call last):
File "//main.py", line 4, in <module>
model.save('model.json')
File "/usr/local/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/usr/local/lib/python3.10/json/encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/local/lib/python3.10/json/encoder.py", line 257, in iterencode
return _iterencode(o, 0)
TypeError: Unable to serialize [2.0896919 2.1128857 2.1081853] to JSON. Unrecognized type <class 'tensorflow.python.framework.ops.EagerTensor'>.
Here is a Dockerfile to easily reproduce it:
FROM python:3.10.8
RUN apt-get update
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get install -y build-essential cmake
RUN pip3 install tensorflow==2.10.0
RUN echo "\n\
import tensorflow as tf\n\
model = tf.keras.applications.efficientnet.EfficientNetB0()\n\
model.save('model.json')\n\
" > main.py
RUN python3 main.py
Also facing this issue.
Encountered an identical issue with EfficientNetB1 and tensorflow-macos==2.10.0
Got the exact same issue with TF 2.10 and TF 2.11 (tried to save the model in SavedModel format and in H5 format, neither worked). Downgrading to TF 2.9 helped for now, but it would be really nice to be able to use the features of more recent TF versions.
NB if useful:
This issue seems to be triggered by the fact the hard-baked normalisation constants are getting evaluated to an EagerTensor before the scaling layer is built - see here
I fixed this locally by moving the logic into python:
At the top:
import math
IMAGENET_STDDEV_RGB = [0.229, 0.224, 0.225]
IMAGENET_STDDEV_RGB = [1 / math.sqrt(i) for i in IMAGENET_STDDEV_RGB]
Then on build just do:
x = layers.Rescaling(IMAGENET_STDDEV_RGB)(x)
Don't have time to raise a PR right now with failure test cases (I also suspect this isn't the most elegant solution), but thought at least a guide to a hotfix might help anyone who does!
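As a quick sanity check of the hotfix idea, here is a stdlib-only sketch (TensorFlow is not involved, and the variable names `scale`, `rounded` and `encoded` are illustrative). It computes the same constants in pure Python and shows that the result is a plain list of floats that a JSON encoder accepts.

```python
import json
import math

IMAGENET_STDDEV_RGB = [0.229, 0.224, 0.225]

# Plain Python floats, computed without creating any tensor.
scale = [1.0 / math.sqrt(s) for s in IMAGENET_STDDEV_RGB]

# Rounded to four places for comparison with the values in the error message.
rounded = [round(v, 4) for v in scale]
print(rounded)

# A list of floats is JSON-serializable as-is.
encoded = json.dumps({"scale": scale})
print(encoded)
```

The rounded values line up with the `[2.0897 2.1129 2.1082]` array in the TypeError, which supports the reading that the failing object is exactly this scaling constant.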
Thanks @hctomkins
So, is the TF team planning to include this fix in a future version of the EfficientNet code, or should we continue to use this local fix in our code?
Thanks Best regards Jerome
Thanks @hctomkins Based on your suggestion, for now we implemented the following workaround in our Dockerfile:
# Workaround for TF 2.10 & 2.11 issue: https://github.com/keras-team/tf-keras/issues/383: "efficientnetBx model.save() fails due to serialization problem with tf2.10.0"
RUN sudo sed -i 's/IMAGENET_STDDEV_RGB = \[0.229, 0.224, 0.225\]/IMAGENET_STDDEV_RGB = \[1 \/ math.sqrt(i) for i in \[0.229, 0.224, 0.225\]\]/g' \
/usr/local/lib/python3.8/dist-packages/keras/applications/efficientnet.py && \
sudo sed -i 's/x = layers.Rescaling(1.0 \/ tf.math.sqrt(IMAGENET_STDDEV_RGB))(x)/x = layers.Rescaling(IMAGENET_STDDEV_RGB)(x)/g' \
/usr/local/lib/python3.8/dist-packages/keras/applications/efficientnet.py
Same here: Unable to serialize [2.0896919 2.1128857 2.1081853] to JSON in 2.11.0.
Easy to reproduce: I simply followed the "Load video data" tutorial and used the Keras model at the bottom.
https://www.tensorflow.org/tutorials/load_data/video#next_steps
Then just add model.save() at the end and it will crash. Ironically, saving as a TF-lite model certainly works!
I can see why people moved to PyTorch.
Apply the fix from https://github.com/keras-team/keras/commit/5b931e64e262c3b44125fdb2534fb4a940cd6e79 manually (as @hctomkins mentioned).
Location: lib/python3.10/site-packages/keras/applications/efficientnet.py (py3.10)
EDIT this:
x = layers.Rescaling(1.0 / tf.math.sqrt(IMAGENET_STDDEV_RGB))(x)
TO:
x = layers.Rescaling(
    [1.0 / math.sqrt(stddev) for stddev in IMAGENET_STDDEV_RGB]
)(x)
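For anyone who would rather script the manual edit above than open the file by hand, a hedged sketch follows. `patch_efficientnet_source` is a hypothetical helper that works on the source text only; it assumes the installed efficientnet.py contains the old expression verbatim and already imports `math`. It uses a single-line form of the replacement so the original indentation stays valid. Reading and writing the real file is left out on purpose.

```python
import re

OLD_EXPR = "layers.Rescaling(1.0 / tf.math.sqrt(IMAGENET_STDDEV_RGB))(x)"
NEW_EXPR = (
    "layers.Rescaling("
    "[1.0 / math.sqrt(stddev) for stddev in IMAGENET_STDDEV_RGB])(x)"
)


def patch_efficientnet_source(source):
    """Return `source` with the tensor-based scale replaced by a Python list.

    Text that does not contain the old expression is returned unchanged,
    so running the patch twice is harmless.
    """
    if OLD_EXPR not in source:
        return source
    # Assumption: the module already has a top-level `import math`.
    if not re.search(r"^import math\b", source, flags=re.MULTILINE):
        raise RuntimeError("module does not import math; add the import by hand")
    return source.replace(OLD_EXPR, NEW_EXPR)


# Tiny stand-in for the relevant part of efficientnet.py.
sample = (
    "import math\n"
    "IMAGENET_STDDEV_RGB = [0.229, 0.224, 0.225]\n"
    "def build(x):\n"
    "    x = layers.Rescaling(1.0 / tf.math.sqrt(IMAGENET_STDDEV_RGB))(x)\n"
    "    return x\n"
)
patched = patch_efficientnet_source(sample)
print(patched)
```

Compared with the sed workaround earlier in the thread, this leaves the `IMAGENET_STDDEV_RGB` constant untouched and changes only the `Rescaling` call, which is the shape of the edit described in this comment.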
@jeromemassot ,
The PR has been merged to the master branch. I have tested the code with the latest tf-nightly (2.13.0-dev20230409) and there is no error now. Please refer to the attached gist: https://colab.research.google.com/gist/SuryanarayanaY/8983f306b822e1dec504d85af09d5a6b/17199-tf-nightly-2-13.ipynb
If anybody still faces the issue in tf-nightly, please let us know.
It seems the commit has not been cherry-picked to the latest release versions. I will raise this with the concerned team and let you know whether it can be cherry-picked to TF 2.12. Until then, users are requested to use tf-nightly.
Thanks!
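To summarize what has been reported in this thread so far, here is a small lookup sketch. `reported_status` is a hypothetical helper, and the categories reflect only the reports in these comments (2.9 working, 2.10 and 2.11 failing, tf-nightly 2.13.0-dev20230409 working); it is not an official compatibility table.

```python
def reported_status(version):
    """Map a TensorFlow version string to what this thread reports for it."""
    major, minor = (int(part) for part in version.split(".")[:2])
    key = (major, minor)
    if key == (2, 9):
        return "reported working"
    if key in {(2, 10), (2, 11)}:
        return "reported failing"
    if key == (2, 12):
        return "fix not confirmed in this thread"
    if key >= (2, 13):
        return "reported fixed in tf-nightly"
    return "not discussed in this thread"


for v in ["2.9.2", "2.10.0", "2.11.0", "2.12.0", "2.13.0-dev20230409"]:
    print(v, "->", reported_status(v))
```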
TF only works for most systems on 2.10 for GPU work so an update to 2.13 isn’t really useful. Google Colab uses 2.9.2, AWS uses 2.10, WSL for Windows to use GPU uses 2.10.
Jesse Richey
Just use EfficientNetV2. It works like a charm, does not have this problem, and has better performance anyway.
Describe the current behavior. model.save() fails and reports a serialization problem.
Describe the expected behavior. The Keras model is saved without error.
Source code / logs.
WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op while saving (showing 5 of 273). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: ./models/EfficientNetB7_Naiads.h5py\assets
INFO:tensorflow:Assets written to: ./models/EfficientNetB7_Naiads.h5py\assets
TypeError Traceback (most recent call last)
Cell In [31], line 1
----> 1 model.save('./models/EfficientNetB7_Naiads.h5py')
File e:\02- Vision Projects\01- Naiads Projects\notebooks.venv\lib\site-packages\keras\utils\traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67 filtered_tb = _process_traceback_frames(e.__traceback__)
     68 # To get the full stack trace, call:
     69 # tf.debugging.disable_traceback_filtering()
---> 70 raise e.with_traceback(filtered_tb) from None
     71 finally:
     72 del filtered_tb
File C:\Python310\lib\json\encoder.py:199, in JSONEncoder.encode(self, o)
    195 return encode_basestring(o)
    196 # This doesn't pass the iterator directly to ''.join() because the
    197 # exceptions aren't as detailed. The list call should be roughly
    198 # equivalent to the PySequence_Fast that ''.join() would do.
--> 199 chunks = self.iterencode(o, _one_shot=True)
    200 if not isinstance(chunks, (list, tuple)):
    201 chunks = list(chunks)
File C:\Python310\lib\json\encoder.py:257, in JSONEncoder.iterencode(self, o, _one_shot)
    252 else:
    253 _iterencode = _make_iterencode(
    ...
    255 self.key_separator, self.item_separator, self.sort_keys,
    256 self.skipkeys, _one_shot)
--> 257 return _iterencode(o, 0)
TypeError: Unable to serialize [2.0897 2.1129 2.1082] to JSON. Unrecognized type <class 'tensorflow.python.framework.ops.EagerTensor'>.