Closed vlasenkoalexey closed 1 year ago
Hi Aleksey,
There seems to be an issue with saving function traces for the custom Cross layer (not sure why). As a workaround, could you please try passing save_traces=False and see if that works for you?
model.save('./criteo/kaggle/saved_model/1', include_optimizer=False, save_traces=False)
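One caveat with this workaround: per the Keras documentation, disabling save_traces stores only the layer configs, so every custom layer needs a get_config() method and has to be passed via custom_objects when loading. Below is a minimal sketch of that round trip, assuming TF 2.6-era Keras as in the report. ToyCross is a hypothetical stand-in and not the Cross layer used by the ranking model, and the temporary export directory is made up for illustration.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


class ToyCross(tf.keras.layers.Layer):
    """Hypothetical stand-in for a custom cross layer: x * (x W + b) + x."""

    def build(self, input_shape):
        dim = int(input_shape[-1])
        self.kernel = self.add_weight(
            name="kernel", shape=(dim, dim), initializer="glorot_uniform")
        self.bias = self.add_weight(
            name="bias", shape=(dim,), initializer="zeros")

    def call(self, x):
        return x * (tf.matmul(x, self.kernel) + self.bias) + x

    def get_config(self):
        # Needed with save_traces=False: the layer is rebuilt from its config
        # on load, because no traced call function is stored for it.
        return super().get_config()


# Small functional model that contains the custom layer.
inputs = tf.keras.Input(shape=(4,))
outputs = tf.keras.layers.Dense(1)(ToyCross()(inputs))
model = tf.keras.Model(inputs, outputs)

x = np.ones((2, 4), dtype=np.float32)
before = model(x).numpy()

# Illustrative export path; the report uses its own saved_model directory.
export_dir = os.path.join(tempfile.mkdtemp(), "saved_model", "1")
model.save(export_dir, include_optimizer=False, save_traces=False)

# Without traces, the custom class has to be supplied when loading.
restored = tf.keras.models.load_model(
    export_dir, custom_objects={"ToyCross": ToyCross}, compile=False)
after = restored(x).numpy()

print(np.allclose(before, after, atol=1e-6))
```

If the real Cross layer does not implement get_config(), loading a model saved this way in Keras will likely fail, so it is worth checking that before relying on the workaround.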
Hi @vlasenkoalexey,
Could you please let us know if the above workaround fixes your error?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.
Closing as stale. Please reopen if you'd like to work on this further.
Prerequisites
Please answer the following questions for yourself before submitting an issue.
1. The entire URL of the file you are using
https://github.com/tensorflow/models/tree/master/official/recommendation/ranking
2. Describe the bug
Can't train the DCN v2 model as described in https://github.com/tensorflow/models/tree/master/official/recommendation/ranking. The model can be trained as DLRM (interaction: 'dot'), but when training as DCN v2 (interaction: 'cross') I get the following warning:
W1123 17:31:52.065743 139786891163392 sequential.py:362] Layers in a Sequential model should only have a single input tensor, but we receive a <class 'list'> input: [<tf.Tensor 'ranking/Squeeze:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_1:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_2:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_3:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_4:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_5:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_6:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_7:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_8:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_9:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_10:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_11:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_12:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_13:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_14:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_15:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_16:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_17:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_18:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_19:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_20:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_21:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_22:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_23:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_24:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/Squeeze_25:0' shape=(512, 32) dtype=float32>, <tf.Tensor 'ranking/mlp/dense_2/Relu:0' shape=(512, 32) dtype=float32>] Consider rewriting this model with the Functional API.
When I try to export this model by adding model.save('...../criteo/kaggle/saved_model/1', include_optimizer=False) after model.fit, it fails with the following error.
` Traceback (most recent call last):
File "models/official/recommendation/ranking/train.py", line 192, in <module>
app.run(main)
File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 303, in run
_run_main(main, args)
File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "models/official/recommendation/ranking/train.py", line 180, in main
model.save('/home/jupyter/models/criteo/kaggle/saved_model/1')
File "/opt/conda/lib/python3.7/site-packages/keras/engine/training.py", line 2146, in save
signatures, options, save_traces)
File "/opt/conda/lib/python3.7/site-packages/keras/saving/save.py", line 150, in save_model
signatures, options, save_traces)
File "/opt/conda/lib/python3.7/site-packages/keras/saving/saved_model/save.py", line 91, in save
model, filepath, signatures, options)
File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/saved_model/save.py", line 1228, in save_and_return_nodes
_build_meta_graph(obj, signatures, options, meta_graph_def))
File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/saved_model/save.py", line 1399, in _build_meta_graph
return _build_meta_graph_impl(obj, signatures, options, meta_graph_def)
File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/saved_model/save.py", line 1336, in _build_meta_graph_impl
checkpoint_graph_view)
File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/saved_model/signature_serialization.py", line 99, in find_function_to_export
functions = saveable_view.list_functions(saveable_view.root)
File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/saved_model/save.py", line 164, in list_functions
self._serialization_cache)
File "/opt/conda/lib/python3.7/site-packages/keras/engine/training.py", line 2813, in _list_functions_for_serialization
Model, self)._list_functions_for_serialization(serialization_cache)
File "/opt/conda/lib/python3.7/site-packages/keras/engine/base_layer.py", line 3086, in _list_functions_for_serialization
.list_functions_for_serialization(serialization_cache))
File "/opt/conda/lib/python3.7/site-packages/keras/saving/saved_model/base_serialization.py", line 93, in list_functions_for_serialization
fns = self.functions_to_serialize(serialization_cache)
File "/opt/conda/lib/python3.7/site-packages/keras/saving/saved_model/layer_serialization.py", line 74, in functions_to_serialize
serialization_cache).functions_to_serialize)
File "/opt/conda/lib/python3.7/site-packages/keras/saving/saved_model/layer_serialization.py", line 90, in _get_serialized_attributes
serialization_cache)
File "/opt/conda/lib/python3.7/site-packages/keras/saving/saved_model/model_serialization.py", line 51, in _get_serialized_attributes_internal
default_signature = save_impl.default_save_signature(self.obj)
File "/opt/conda/lib/python3.7/site-packages/keras/saving/saved_model/save_impl.py", line 208, in default_save_signature
fn = saving_utils.trace_model_call(layer)
File "/opt/conda/lib/python3.7/site-packages/keras/saving/saving_utils.py", line 135, in trace_model_call
return _wrapped_model.get_concrete_function(*model_args, **model_kwargs)
File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 1233, in get_concrete_function
concrete = self._get_concrete_function_garbage_collected(*args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 1213, in _get_concrete_function_garbage_collected
self._initialize(args, kwargs, add_initializers_to=initializers)
File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 760, in _initialize
args, kwds))
File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3066, in _get_concrete_function_internal_garbage_collected
graph_function, _ = self._maybe_define_function(args, kwargs)
File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3463, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3308, in _create_graph_function
capture_by_value=self._capture_by_value),
File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 1007, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 668, in wrapped_fn
out = weak_wrapped_fn().__wrapped__(*args, **kwds)
File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 994, in wrapper
raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:
`
The DLRM model can be exported just fine.
3. Steps to reproduce
Add the following line after model.fit: model.save('...../criteo/kaggle/saved_model/1', include_optimizer=False), then train the model as DCN v2 (interaction: 'cross') as described in the README.
4. Expected behavior
The model should be trained and exported.
5. Additional context
Include any logs that would be helpful to diagnose the problem.
6. System information
v2.6.0-rc2-32-g919f693420e 2.6.0