NVIDIA-Merlin / systems

Merlin Systems provides tools for combining recommendation models with other elements of production recommender systems (like feature stores, nearest neighbor search, and exploration strategies) into end-to-end recommendation pipelines that can be served with Triton Inference Server.
Apache License 2.0

[BUG] Unable to serve a topK session-based model on Triton #383

Open · rnyak opened this issue 12 months ago

rnyak commented 12 months ago

Bug description

I am trying to serve a session-based top-k model (topk_model) that is generated with the to_top_k_encoder method from the Merlin Models library.
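For reference, I create the top-k encoder roughly like this (a minimal sketch; model is my trained session-based model and k=10 is just an illustrative value):

# Sketch only: `model` is the trained session-based Merlin model.
# to_top_k_encoder wraps it so inference returns the top-k scores and
# item ids instead of scores for the full catalog; k=10 is illustrative.
topk_model = model.to_top_k_encoder(k=10)
topk_model.compile(run_eagerly=False)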

However, the ensemble.export call in the code below raises the following error:

import os
from merlin.systems.dag.ensemble import Ensemble
from merlin.systems.dag.ops.tensorflow import PredictTensorflow
from merlin.systems.dag.ops.workflow import TransformWorkflow

# wf is the fitted NVTabular workflow; topk_model is the top-k encoder
inf_ops = wf.input_schema.column_names >> TransformWorkflow(wf) >> PredictTensorflow(topk_model)
ensemble = Ensemble(inf_ops, wf.input_schema)
ensemble.export(os.path.join('/workspace/Lowes/', 'ensemble_topk'))
INFO:tensorflow:Assets written to: /workspace/Lowes/ensemble_topk/1_predicttensorflowtriton/1/model.savedmodel/assets
INFO:tensorflow:Assets written to: /workspace/Lowes/ensemble_topk/1_predicttensorflowtriton/1/model.savedmodel/assets
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[68], line 1
----> 1 ensemble.export(os.path.join('/workspace/Lowes/', 'ensemble_topk'))

File /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ensemble.py:153, in Ensemble.export(self, export_path, runtime, **kwargs)
    148 """
    149 Write out an ensemble model configuration directory. The exported
    150 ensemble is designed for use with Triton Inference Server.
    151 """
    152 runtime = runtime or TritonExecutorRuntime()
--> 153 return runtime.export(self, export_path, **kwargs)

File /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/runtimes/triton/runtime.py:133, in TritonExecutorRuntime.export(self, ensemble, path, version, name)
    131 node_id = node_id_table.get(node, None)
    132 if node_id is not None:
--> 133     node_config = node.op.export(
    134         path, node.input_schema, node.output_schema, node_id=node_id, version=version
    135     )
    136     if node_config is not None:
    137         node_configs.append(node_config)

File /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/runtimes/triton/ops/tensorflow.py:113, in PredictTensorflowTriton.export(self, path, input_schema, output_schema, params, node_id, version)
    107     copytree(
    108         str(self.path),
    109         tf_model_path,
    110         dirs_exist_ok=True,
    111     )
    112 else:
--> 113     self.model.save(tf_model_path, include_optimizer=False)
    115 self.set_tf_model_name(node_name)
    116 backend_model_config = self._export_model_config(node_name, node_export_path)

File /usr/local/lib/python3.8/dist-packages/merlin/models/tf/core/encoder.py:335, in Encoder.save(self, export_path, include_optimizer, save_traces)
    315 def save(
    316     self,
    317     export_path: Union[str, os.PathLike],
    318     include_optimizer=True,
    319     save_traces=True,
    320 ) -> None:
    321     """Saves the model to export_path as a Tensorflow Saved Model.
    322     Along with merlin model metadata.
    323 
   (...)
    333         stored, by default True
    334     """
--> 335     super().save(
    336         export_path,
    337         include_optimizer=include_optimizer,
    338         save_traces=save_traces,
    339         save_format="tf",
    340     )
    341     input_schema = self.schema
    342     output_schema = get_output_schema(export_path)

File /usr/local/lib/python3.8/dist-packages/merlin/models/tf/models/base.py:1613, in BaseModel.save(self, *args, **kwargs)
   1611 if hvd_installed and hvd.rank() != 0:
   1612     return
-> 1613 super().save(*args, **kwargs)

File /usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File /usr/lib/python3.8/json/encoder.py:199, in JSONEncoder.encode(self, o)
    195         return encode_basestring(o)
    196 # This doesn't pass the iterator directly to ''.join() because the
    197 # exceptions aren't as detailed.  The list call should be roughly
    198 # equivalent to the PySequence_Fast that ''.join() would do.
--> 199 chunks = self.iterencode(o, _one_shot=True)
    200 if not isinstance(chunks, (list, tuple)):
    201     chunks = list(chunks)

File /usr/lib/python3.8/json/encoder.py:257, in JSONEncoder.iterencode(self, o, _one_shot)
    252 else:
    253     _iterencode = _make_iterencode(
    254         markers, self.default, _encoder, self.indent, floatstr,
    255         self.key_separator, self.item_separator, self.sort_keys,
    256         self.skipkeys, _one_shot)
--> 257 return _iterencode(o, 0)

TypeError: Unable to serialize item_id-list to JSON. Unrecognized type <class 'merlin.schema.schema.ColumnSchema'>.
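The failure appears to come from Keras trying to JSON-serialize the encoder's config, which contains the ColumnSchema for the item_id-list column. The underlying problem can be illustrated in isolation (stdlib json here rather than the Keras encoder, so the message wording differs):

import json
from merlin.schema import ColumnSchema

# ColumnSchema is not JSON-serializable, so a model config that embeds
# one fails when the SavedModel config is written out.
json.dumps({"item_id-list": ColumnSchema("item_id-list")})
# TypeError: Object of type ColumnSchema is not JSON serializable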

Steps/Code to reproduce bug

Please run the code in this gist to reproduce the issue.

Expected behavior

We should be able to serve the topk_model on Triton and get back the top-k scores and item indices.
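For reference, once the export succeeds I would expect to query the ensemble along these lines (a sketch, assuming the send_triton_request helper from merlin.systems and the default executor_model name; batch and output_cols are placeholders for a DataFrame of raw sessions and the top-k output column names):

from merlin.systems.triton.utils import send_triton_request

# Sketch: `batch` matches wf.input_schema; `output_cols` names the
# top-k score and item-id outputs of the exported ensemble.
response = send_triton_request(
    wf.input_schema, batch, output_cols,
    endpoint="localhost:8001", triton_model="executor_model",
)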

Environment details

I am using the merlin-tensorflow:23.06 container and pulling the main branch of each Merlin library.