keras-team / tf-keras

The TensorFlow-specific implementation of the Keras API, which was the default Keras from 2019 to 2023.
Apache License 2.0

"No common supertype of ..." while loading model #431

Open ahasselbring opened 2 years ago

ahasselbring commented 2 years ago

System information.

Describe the problem.

I can't load a model when passing a variable to a submodel. See the example below. This explicitly tells me to "[...] file a bug if [...] being hindered by this error." Code like that worked with previous versions of TensorFlow (2.7 I think).

Describe the current behavior.

Loading fails with No common supertype of TensorSpec(...) and VariableSpec(...).

Describe the expected behavior.

The model is loaded without errors.

Contributing.

Standalone code to reproduce the issue.

import tensorflow as tf

class SubModel(tf.keras.models.Model):
    def call(self, x):
        return ()  # My actual model returns something depending on x here, but even this fails.

class MainModel(tf.keras.models.Model):
    def __init__(self):
        super().__init__()
        self.sub_model = SubModel()
        self.x = tf.Variable(tf.zeros((1,)))

    def call(self, _):
        return self.sub_model(self.x)

model = MainModel()
model(())
model.save("model")
model = tf.keras.models.load_model("model")

Source code / logs.

Output of the above script:

Traceback (most recent call last):
  File ".../minimal-example.py", line 19, in <module>
    model = tf.keras.models.load_model("model")
  File ".../.venv/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File ".../.venv/lib/python3.10/site-packages/keras/saving/saved_model/load.py", line 1151, in common_spec
    raise TypeError(f'No common supertype of {x} and {y}.')
TypeError: No common supertype of TensorSpec(shape=(None,), dtype=tf.float32, name=None) and VariableSpec(shape=(1,), dtype=<dtype: 'float32'>, trainable=True).

sushreebarsa commented 2 years ago

@ahasselbring I tried to replicate the issue on Colab and got the following error:

ValueError: Types are not compatible: TensorSpec(shape=(None,), dtype=tf.float32, name=None) with type of <class 'tensorflow.python.framework.tensor_spec.TensorSpec'> vs VariableSpec(shape=(1,), dtype=tf.float32, name='x') with type of <class 'tensorflow.python.ops.resource_variable_ops.VariableSpec'>.

Could you find the gist here and confirm the same? Thank you!

ahasselbring commented 2 years ago

Indeed I get that error when I use TensorFlow 2.8, but with both 2.9 and 2.10 the error is as in the issue description:

No common supertype of TensorSpec(shape=(None,), dtype=tf.float32, name=None) and VariableSpec(shape=(1,), dtype=<dtype: 'float32'>, trainable=True).

coming from here (2.9) or here (2.10). Unfortunately I can't test with 2.7 anymore (I'm quite sure things like that worked with that version - maybe not this minimal example, but generally, passing variables to submodels, letting them do calculations and return them).

gowthamkpr commented 2 years ago

@ahasselbring With TensorFlow 2.7, there is a warning that states: WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.

You can find the gist here.

ahasselbring commented 2 years ago

That warning can easily be removed by adding compile=False to the load_model call in 2.7 (or by actually doing some training with a useful model, but that doesn't make sense in a minimal example). In newer versions, the error occurs regardless of that option.

haifeng-jin commented 2 years ago

Summary: This model cannot be saved and loaded. (Minor modifications made to the original snippet; tested in Colab.)

class SubModel(tf.keras.models.Model):
    def call(self, x):
        return x

class MainModel(tf.keras.models.Model):
    def __init__(self):
        super().__init__()
        self.sub_model = SubModel()
        self.x = tf.Variable(tf.zeros((1,)))

    def call(self, x):
        self.sub_model(self.x)
        return x

rchao commented 2 years ago

Thanks for reporting the issue. At this point we lack the bandwidth to look further into this, and community contributions for a potential fix are welcome.

ahasselbring commented 2 years ago

I am working around this now by passing variable + tf.zeros_like(variable) instead of variable.
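For anyone who wants to try the same workaround, here is a minimal sketch applied to the model from the issue description. The helper name as_plain_tensor is hypothetical (it is not part of TensorFlow or Keras), and the dummy input passed to the model is only there to build it. The sketch only shows that the submodel receives an ordinary tensor instead of the tf.Variable; whether this avoids the loading error on a given TensorFlow version was not re-verified here.

```python
import tensorflow as tf


def as_plain_tensor(variable):
    """Hypothetical helper: return a tensor holding the variable's current value."""
    # Adding zeros leaves the value unchanged, but the result of the addition
    # is an ordinary tensor rather than a tf.Variable.
    return variable + tf.zeros_like(variable)


class SubModel(tf.keras.models.Model):
    def call(self, x):
        return x


class MainModel(tf.keras.models.Model):
    def __init__(self):
        super().__init__()
        self.sub_model = SubModel()
        self.x = tf.Variable(tf.zeros((1,)))

    def call(self, _):
        # Pass a tensor, not the variable itself, to the submodel.
        return self.sub_model(as_plain_tensor(self.x))


model = MainModel()
output = model(tf.zeros((1,)))  # dummy input; call() ignores it
```

The addition is kept inside call() so that the value is still read from the variable each time the model runs.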

georgeyw commented 1 year ago

I have a similar issue with a model that uses RaggedTensors. I see an error message that looks something like "No common supertype of TensorSpec and RaggedTensorSpec".

I seem to be able to fix this by rolling back the code for infer_inputs_from_restored_call_function from r2.10 (found here: https://github.com/keras-team/keras/blob/r2.10/keras/saving/saved_model/load.py#L1293) to r2.7 (found here: https://github.com/keras-team/keras/blob/r2.7/keras/saving/saved_model/load.py#L1172, which also requires adding back get_common_shape right above it), but leaving everything else in r2.10 the same. It looks to me like this is the only place where this function is used.

I didn't notice any problems with this in my local testing. Does anyone have context for why this was changed, or know of side effects of rolling it back?

ThibaultDef commented 9 months ago

By looking directly into the code, I saw that the error comes from the function most_specific_common_supertype (within https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/framework/type_spec.py). If I understood correctly, it compares types between tensors. If the types of two tensors do not match (for example, one is of type float32 and the other is of type float16), then the error is raised.