tensorflow / model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
https://www.tensorflow.org/model_optimization
Apache License 2.0

Keras TFOpLambda may conflict with quantization #867

Open hguandl opened 2 years ago

hguandl commented 2 years ago

Describe the bug

When we try to quantize a model containing a TFOpLambda layer, an AttributeError ('list' object has no attribute 'dtype') occurs.

System information

TensorFlow version (installed from binary): 2.5.0

TensorFlow Model Optimization version (installed from binary): 0.6.0

Python version: 3.9.6

Describe the expected behavior

quantize_model should return a quantized model without any error.

Describe the current behavior

AttributeError: 'list' object has no attribute 'dtype'

Code to reproduce the issue

https://colab.research.google.com/gist/hguandl/740df0dd42be2b220be26671c1e33119/oplambda-repro-minimal.ipynb

import tensorflow_model_optimization as tfmot
from tensorflow import keras
from tensorflow.python.keras.layers.core import TFOpLambda
from tensorflow.python.ops.math_ops import _add_dispatch
from tensorflow.python.util.tf_export import tf_export

# Register the function as a TF API symbol so TFOpLambda can serialize it.
@tf_export("test_lambda")
def custom_layer(tensor):
    return _add_dispatch(tensor, 2)  # equivalent to tensor + 2

inputs = keras.Input(shape=(784,))
outputs = TFOpLambda(custom_layer)(inputs)

model = keras.models.Model(inputs=inputs, outputs=outputs)

# Fails with AttributeError: 'list' object has no attribute 'dtype'
q_model = tfmot.quantization.keras.quantize_model(model)
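
For context, here is a minimal sketch of the failure mode in isolation (assuming the same _add_dispatch import as above): _add_dispatch reads x.dtype, so handing it a single-tensor list instead of a tensor reproduces the exact error.

import tensorflow as tf
from tensorflow.python.ops.math_ops import _add_dispatch

t = tf.zeros((1, 784))
_add_dispatch(t, 2)    # works: t.dtype is available
_add_dispatch([t], 2)  # AttributeError: 'list' object has no attribute 'dtype'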

Screenshots

Please refer to results in the Colab above.

Additional context

The bug was first discussed in https://github.com/tensorflow/model-optimization/issues/546. I have found that TFOpLambda layers are forced to enable _preserve_input_structure_in_config, which prevents the unwrapping of single-tensor lists. Therefore the input of the lambda is a list rather than a Tensor.

About TFOpLambda: tf_op_layer.py;

About unwrapping: functional.py.
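
A paraphrased sketch of the relevant logic in keras/engine/functional.py (process_node, which also appears verbatim in the traceback further down): when the flag is set, single-tensor lists are not unwrapped, so the wrapped op receives [tensor] instead of tensor.

# Paraphrased from keras/engine/functional.py, process_node():
if not layer._preserve_input_structure_in_config:
  input_tensors = base_layer_utils.unnest_if_single_tensor(input_tensors)
# TFOpLambda sets the flag, so the unwrapping is skipped and the layer is
# called with a single-element list, which _add_dispatch cannot handle.
output_tensors = layer(input_tensors, **kwargs)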

xiaotown123 commented 2 years ago

Is there any progress on this issue? @sngyhan @hguandl

daverim commented 2 years ago

Hi, this is true; however, we generally do not support quantize_model for lambda operations. The expectation is that quantization is only applied to supported ops, and a Lambda is not one. Instead of skipping the layer, it throws an error -- we should probably surface the lack of support for Lambda ops more clearly.

You should use a custom layer instead of a Lambda and provide a quantization config for that layer. For example, you would change the model to:

import tensorflow as tf
from tensorflow.python.ops.math_ops import _add_dispatch

class CustomLayer(tf.keras.layers.Layer):

  def call(self, tensor, training=False):
    # Same computation as the TFOpLambda above, but as a proper layer.
    return _add_dispatch(tensor, 2)

This provides a bit more control, and you can apply quantization to it with:

quantizers = tfmot.quantization.keras.quantizers
quantize_annotate_layer = tfmot.quantization.keras.quantize_annotate_layer

class CustomConfig(tfmot.quantization.keras.QuantizeConfig):
  def get_weights_and_quantizers(self, layer):
    # No weights to quantize.
    return []

  def get_activations_and_quantizers(self, layer):
    # No activations to quantize.
    return []

  def set_quantize_weights(self, layer, quantize_weights):
    pass

  def set_quantize_activations(self, layer, quantize_activations):
    pass

  def get_output_quantizers(self, layer):
    # Quantize only the layer's output.
    return [quantizers.MovingAverageQuantizer(
        num_bits=8, per_axis=False, symmetric=False, narrow_range=False)]

  def get_config(self):
    return {}
...
outputs = quantize_annotate_layer(CustomLayer(), CustomConfig())(x)

You will need to add these objects to the custom scope with:

with tfmot.quantization.keras.quantize_scope({'CustomLayer': CustomLayer,
                                              'CustomConfig': CustomConfig}):
  q_model = tfmot.quantization.keras.quantize_model(model)

If you really wish to use lambda layers with the full-model quantize_model API, feel free to open a PR that we can take a look at, but unfortunately, like subclassed layers, they probably will not be supported automatically any time soon.

hguandl commented 2 years ago

So can I treat this as a documentation issue?

The official guide does not mention the incompatibility with lambda operations (only TF 1.x and subclassed models are mentioned).

In addition, lambda layers are used in MobileNetV3, one of the built-in applications from Keras. Developers may be confused when quantization fails with an unclear error (#546).

Therefore I think it would be friendlier if TensorFlow provided a list of incompatible built-in (tensorflow.python.keras.layers) operations for developers to check.
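
Until such a list exists, a small hypothetical helper (find_unsupported_lambdas is not part of tfmot; the tuple of layer types is an assumption based on this thread) can scan a model before attempting quantization:

def find_unsupported_lambdas(model):
  # Layer types that quantize_model is known (per this thread) to choke on;
  # extend the tuple as other unsupported types surface.
  unsupported = ("Lambda", "TFOpLambda", "SlicingOpLambda")
  return [layer.name for layer in model.layers
          if layer.__class__.__name__ in unsupported]

print(find_unsupported_lambdas(model))  # e.g. ['tf.__operators__.add']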

daverim commented 2 years ago

No, it is a bug -- there should be a best effort to quantize the output of the lambda layer, and at the very least it should not crash. I filed a bug internally with this issue as a reference. I just posted a workaround for cases where the original model can be modified.

Will update when a fix is available.

MATTYGILO commented 2 years ago

Any updates on this? I'm using TFOpLambda and trying to cluster, but I'm getting the same bug.

MATTYGILO commented 2 years ago

My error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [10], in <cell line: 1>()
----> 1 final_model = tfmot.clustering.keras.strip_clustering(clustered_model)
      3 final_model.summary()

File ~/.local/lib/python3.10/site-packages/tensorflow_model_optimization/python/core/clustering/keras/cluster.py:356, in strip_clustering(model)
    353   return layer
    355 # Just copy the model with the right callback
--> 356 return tf.keras.models.clone_model(
    357     model, input_tensors=None, clone_function=_strip_clustering_wrapper)

File ~/miniforge3/envs/net/lib/python3.10/site-packages/keras/models.py:456, in clone_model(model, input_tensors, clone_function)
    453   return _clone_sequential_model(
    454       model, input_tensors=input_tensors, layer_fn=clone_function)
    455 else:
--> 456   return _clone_functional_model(
    457       model, input_tensors=input_tensors, layer_fn=clone_function)

File ~/miniforge3/envs/net/lib/python3.10/site-packages/keras/models.py:197, in _clone_functional_model(model, input_tensors, layer_fn)
    193 model_configs, created_layers = _clone_layers_and_model_config(
    194     model, new_input_layers, layer_fn)
    195 # Reconstruct model from the config, using the cloned layers.
    196 input_tensors, output_tensors, created_layers = (
--> 197     functional.reconstruct_from_config(model_configs,
    198                                        created_layers=created_layers))
    199 metrics_names = model.metrics_names
    200 model = Model(input_tensors, output_tensors, name=model.name)

File ~/miniforge3/envs/net/lib/python3.10/site-packages/keras/engine/functional.py:1338, in reconstruct_from_config(config, custom_objects, created_layers)
   1336 while layer_nodes:
   1337   node_data = layer_nodes[0]
-> 1338   if process_node(layer, node_data):
   1339     layer_nodes.pop(0)
   1340   else:
   1341     # If a node can't be processed, stop processing the nodes of
   1342     # the current layer to maintain node ordering.

File ~/miniforge3/envs/net/lib/python3.10/site-packages/keras/engine/functional.py:1282, in reconstruct_from_config.<locals>.process_node(layer, node_data)
   1279 if not layer._preserve_input_structure_in_config:
   1280   input_tensors = (
   1281       base_layer_utils.unnest_if_single_tensor(input_tensors))
-> 1282 output_tensors = layer(input_tensors, **kwargs)
   1284 # Update node index map.
   1285 output_index = (tf.nest.flatten(output_tensors)[0].
   1286                 _keras_history.node_index)

File ~/miniforge3/envs/net/lib/python3.10/site-packages/keras/utils/traceback_utils.py:67, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     65 except Exception as e:  # pylint: disable=broad-except
     66   filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67   raise e.with_traceback(filtered_tb) from None
     68 finally:
     69   del filtered_tb

File ~/miniforge3/envs/net/lib/python3.10/site-packages/tensorflow/python/ops/math_ops.py:1733, in _add_dispatch(x, y, name)
   1712 """The operation invoked by the `Tensor.__add__` operator.
   1713 
   1714   Purpose in the API:
   (...)
   1729   The result of the elementwise `+` operation.
   1730 """
   1731 if not isinstance(y, ops.Tensor) and not isinstance(
   1732     y, sparse_tensor.SparseTensor):
-> 1733   y = ops.convert_to_tensor(y, dtype_hint=x.dtype.base_dtype, name="y")
   1734 if x.dtype == dtypes.string:
   1735   return gen_math_ops.add(x, y, name=name)

MATTYGILO commented 2 years ago

I was thinking it would be great if you could exclude certain layers from clustering and simply ignore Lambda operations.
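
For what it's worth, here is a rough, untested sketch of that idea using the public cluster_weights API and clone_model (the layer-type filter and the clustering parameters are my own assumptions, not an endorsed tfmot recipe):

import tensorflow as tf
import tensorflow_model_optimization as tfmot

clustering_params = {
    "number_of_clusters": 16,
    "cluster_centroids_init":
        tfmot.clustering.keras.CentroidInitialization.KMEANS_PLUS_PLUS,
}

def cluster_all_but_lambda(layer):
  # Leave Lambda/TFOpLambda layers (and weightless layers) untouched.
  if layer.__class__.__name__ in ("Lambda", "TFOpLambda") or not layer.weights:
    return layer
  return tfmot.clustering.keras.cluster_weights(layer, **clustering_params)

clustered_model = tf.keras.models.clone_model(
    model, clone_function=cluster_all_but_lambda)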

pkohn19 commented 2 years ago

Any news on this? How can we work around this bug to perform QAT on MobileNetV3?

MATTYGILO commented 2 years ago

@pkohn19 Have a read of my stackoverflow question https://stackoverflow.com/questions/72361837/quantisation-of-custom-model-with-custom-layer-full-int-8

juancresc commented 11 months ago

This is still an ongoing issue:

/local_disk0/.ephemeral_nfs/envs/pythonEnv-923cb15a-8291-4e8e-a00d-daa077cfa7ee/lib/python3.9/site-packages/tensorflow_model_optimization/python/core/quantization/keras/quantize.py:216: UserWarning: Lambda layers are not supported by automatic model annotation because the internal functionality cannot always be determined by serialization alone. We recommend that you make a custom layer and add a custom QuantizeConfig for it instead. This layer will not be quantized which may lead to unexpected results.
  warnings.warn(
ValueError: Exception encountered when calling layer "sum_group_apps" (type Lambda).

None values not supported.

pedrofrodenas commented 2 days ago

I have the same problem. I am using TensorFlow 2.17 with tf_keras 2.17.

I am importing the MobileNetV3Small Keras model:

model = keras.applications.MobileNetV3Small(
    input_shape=tuple(input_shape),
    alpha=width_multiplier,
    minimalistic=False,
    include_top=True,
    weights="imagenet",
    input_tensor=None,
    classes=1000,
    pooling=None,
    dropout_rate=0.2,
    classifier_activation="softmax",
    include_preprocessing=include_preprocessing,
)

After applying quantization:

q_aware_model = quantize_model(model)

AttributeError: Exception encountered when calling layer "tf.__operators__.add" (type TFOpLambda).

'list' object has no attribute 'dtype'

Call arguments received by layer "tf.__operators__.add" (type TFOpLambda):
  • x=['tf.Tensor(shape=(None, 112, 192, 16), dtype=float32)']
  • y=3.0
  • name=None
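
As a stopgap, here is a hedged, untested sketch of the selective-annotation approach suggested earlier in this thread: annotate everything except the Lambda/TFOpLambda layers with quantize_annotate_layer, then call quantize_apply, so the unsupported ops are simply left in float. This is an adaptation of the maintainer's workaround, not a confirmed fix for the MobileNetV3 case:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

annotate = tfmot.quantization.keras.quantize_annotate_layer

def annotate_supported(layer):
  # Leave Lambda/TFOpLambda (and the input layer) unannotated so that
  # quantize_apply skips them and they stay in float.
  if isinstance(layer, tf.keras.layers.InputLayer):
    return layer
  if layer.__class__.__name__ in ("Lambda", "TFOpLambda"):
    return layer
  return annotate(layer)

annotated_model = tf.keras.models.clone_model(
    model, clone_function=annotate_supported)
q_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)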