tensorflow / models

Models and examples built with TensorFlow

DCN v2 model can't be exported #10390

Closed vlasenkoalexey closed 1 year ago

vlasenkoalexey commented 2 years ago

Prerequisites

Please answer the following questions for yourself before submitting an issue.

1. The entire URL of the file you are using

https://github.com/tensorflow/models/tree/master/official/recommendation/ranking

2. Describe the bug

The DCN v2 model can't be trained as described in the README (https://github.com/tensorflow/models/tree/master/official/recommendation/ranking). The model can be trained as DLRM (interaction: 'dot'), but when training as DCN v2 (interaction: 'cross') I get the following warning:

W1123 17:31:52.065743 139786891163392 sequential.py:362] Layers in a Sequential model should only have a single input tensor, but we receive a <class 'list'> input: [<tf.Tensor 'ranking/Squeeze:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_1:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_2:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_3:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_4:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_5:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_6:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_7:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_8:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_9:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_10:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_11:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_12:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_13:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_14:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_15:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_16:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_17:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_18:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_19:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_20:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_21:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_22:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_23:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_24:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_25:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/mlp/dense_2/Relu:0' shape=(512, 32) dtype=float32>] Consider rewriting this model with the Functional API.

When I try to export this model by adding model.save('...../criteo/kaggle/saved_model/1', include_optimizer=False) after model.fit, it fails with the following error:

`
Traceback (most recent call last):
  File "models/official/recommendation/ranking/train.py", line 192, in <module>
    app.run(main)
  File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "models/official/recommendation/ranking/train.py", line 180, in main
    model.save('/home/jupyter/models/criteo/kaggle/saved_model/1')
  File "/opt/conda/lib/python3.7/site-packages/keras/engine/training.py", line 2146, in save
    signatures, options, save_traces)
  File "/opt/conda/lib/python3.7/site-packages/keras/saving/save.py", line 150, in save_model
    signatures, options, save_traces)
  File "/opt/conda/lib/python3.7/site-packages/keras/saving/saved_model/save.py", line 91, in save
    model, filepath, signatures, options)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/saved_model/save.py", line 1228, in save_and_return_nodes
    _build_meta_graph(obj, signatures, options, meta_graph_def))
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/saved_model/save.py", line 1399, in _build_meta_graph
    return _build_meta_graph_impl(obj, signatures, options, meta_graph_def)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/saved_model/save.py", line 1336, in _build_meta_graph_impl
    checkpoint_graph_view)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/saved_model/signature_serialization.py", line 99, in find_function_to_export
    functions = saveable_view.list_functions(saveable_view.root)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/saved_model/save.py", line 164, in list_functions
    self._serialization_cache)
  File "/opt/conda/lib/python3.7/site-packages/keras/engine/training.py", line 2813, in _list_functions_for_serialization
    Model, self)._list_functions_for_serialization(serialization_cache)
  File "/opt/conda/lib/python3.7/site-packages/keras/engine/base_layer.py", line 3086, in _list_functions_for_serialization
    .list_functions_for_serialization(serialization_cache))
  File "/opt/conda/lib/python3.7/site-packages/keras/saving/saved_model/base_serialization.py", line 93, in list_functions_for_serialization
    fns = self.functions_to_serialize(serialization_cache)
  File "/opt/conda/lib/python3.7/site-packages/keras/saving/saved_model/layer_serialization.py", line 74, in functions_to_serialize
    serialization_cache).functions_to_serialize)
  File "/opt/conda/lib/python3.7/site-packages/keras/saving/saved_model/layer_serialization.py", line 90, in _get_serialized_attributes
    serialization_cache)
  File "/opt/conda/lib/python3.7/site-packages/keras/saving/saved_model/model_serialization.py", line 51, in _get_serialized_attributes_internal
    default_signature = save_impl.default_save_signature(self.obj)
  File "/opt/conda/lib/python3.7/site-packages/keras/saving/saved_model/save_impl.py", line 208, in default_save_signature
    fn = saving_utils.trace_model_call(layer)
  File "/opt/conda/lib/python3.7/site-packages/keras/saving/saving_utils.py", line 135, in trace_model_call
    return _wrapped_model.get_concrete_function(*model_args, **model_kwargs)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 1233, in get_concrete_function
    concrete = self._get_concrete_function_garbage_collected(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 1213, in _get_concrete_function_garbage_collected
    self._initialize(args, kwargs, add_initializers_to=initializers)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 760, in _initialize
    *args, **kwds))
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3066, in _get_concrete_function_internal_garbage_collected
    graph_function, _ = self._maybe_define_function(args, kwargs)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3463, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3308, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 1007, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 668, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 994, in wrapper
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

/opt/conda/lib/python3.7/site-packages/keras/saving/saving_utils.py:125 _wrapped_model  *
    outputs = model(*args, **kwargs)
/opt/conda/lib/python3.7/site-packages/tensorflow_recommenders/experimental/models/ranking.py:213 call  *
    interaction_output = self._feature_interaction(interaction_args)
/opt/conda/lib/python3.7/site-packages/tensorflow_recommenders/layers/feature_interaction/dcn.py:167 call  *
    prod_output = self._dense(x)
/opt/conda/lib/python3.7/site-packages/keras/engine/base_layer.py:1020 __call__  **
    input_spec.assert_input_compatibility(self.input_spec, inputs, self.name)
/opt/conda/lib/python3.7/site-packages/keras/engine/input_spec.py:254 assert_input_compatibility
    ' but received input with shape ' + display_shape(x.shape))

ValueError: Input 0 of layer dense is incompatible with the layer: expected axis -1 of input shape to have value 864 but received input with shape (None, 32)

`
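The numbers in the error line up with the warning above: the list passed to the interaction layer holds 27 tensors of width 32 (ranking/Squeeze:0 through ranking/Squeeze_25:0 plus the MLP output), and 27 × 32 = 864. The sketch below only reproduces that shape arithmetic in plain Python. The helper names `concat_last_axis` and `check_last_axis` are hypothetical and are not part of Keras or TensorFlow Recommenders. Reading this as "the dense layer was built on the concatenation, but the export trace passed a single 32-wide tensor" is an inference from the log, not a confirmed diagnosis.

```python
# Shapes taken from the warning above: 26 embedding outputs
# (ranking/Squeeze:0 .. ranking/Squeeze_25:0) plus the MLP output.
BATCH, WIDTH = 512, 32
input_shapes = [(BATCH, WIDTH)] * 27


def concat_last_axis(shapes):
    """Return the shape produced by concatenating along the last axis."""
    leading = {s[:-1] for s in shapes}
    if len(leading) != 1:
        raise ValueError(f"leading dimensions differ: {sorted(leading)}")
    return shapes[0][:-1] + (sum(s[-1] for s in shapes),)


def check_last_axis(expected, shape):
    """Hypothetical stand-in for a dense layer's input-shape check."""
    if shape[-1] != expected:
        raise ValueError(
            f"expected axis -1 to have value {expected} "
            f"but received input with shape {shape}"
        )


concat_shape = concat_last_axis(input_shapes)
print(concat_shape)  # (512, 864)

check_last_axis(864, concat_shape)       # passes: all 27 inputs concatenated
try:
    check_last_axis(864, (None, WIDTH))  # a single 32-wide tensor
except ValueError as err:
    print(err)
```

If this reading is right, the mismatch would come from how the list input is handled during export tracing rather than from the trained weights themselves.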

DLRM model can be exported just fine.

3. Steps to reproduce

Add the following line after model.fit: model.save('...../criteo/kaggle/saved_model/1', include_optimizer=False), then train the model as DCN v2 (interaction: 'cross') as described in the README.

4. Expected behavior

Model should be trained and exported.

5. Additional context

Include any logs that would be helpful to diagnose the problem.

6. System information

v2.6.0-rc2-32-g919f693420e 2.6.0

gagika commented 2 years ago

Hi Aleksey, there seems to be an issue with saving function traces for the custom Cross layer (not sure why). As a workaround, could you please try passing save_traces=False and see if that works for you?

model.save('./criteo/kaggle/saved_model/1', include_optimizer=False, save_traces=False)

laxmareddyp commented 1 year ago

Hi @vlasenkoalexey,

Could you please let us know whether the workaround above fixes your error?

google-ml-butler[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

google-ml-butler[bot] commented 1 year ago

Closing as stale. Please reopen if you'd like to work on this further.
