NVIDIA-Merlin / models

Merlin Models is a collection of deep learning recommender system model reference implementations
https://nvidia-merlin.github.io/models/main/index.html
Apache License 2.0

Ensure TopKEncoder has correct outputs when model is saved #1225

Closed oliverholworthy closed 10 months ago

oliverholworthy commented 11 months ago

Fixes #1220

Goals :soccer:

Ensure the top-k encoder model outputs are correct when the model is saved.

Implementation Details :construction:

Remove output_names in BaseModel.compile when the model has a TopKOutput. This lets Keras work out the output names correctly.

We currently set the output name based on the name of the model output. However, the BruteForce TopKLayer outputs a 2-tuple of scores and ids. When model.output_names is set to ["top_k_output"], Keras writes out the model with only one output, under this name. The relevant part in the Keras code (2.12) is here
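
One way to picture the failure, as a pure-Python sketch rather than actual Keras code: assume, for illustration only, that output names are paired positionally with the model outputs when the serving signature is built. The function build_signature_outputs and the placeholder strings are hypothetical stand-ins; only the mismatch between one name and two outputs comes from the description above.

```python
def build_signature_outputs(output_names, outputs):
    """Pair names with outputs by position (hypothetical stand-in, not Keras code)."""
    # zip() stops at the shorter sequence, so any surplus outputs are dropped silently.
    return dict(zip(output_names, outputs))


# The brute-force top-k layer returns a 2-tuple: (scores, ids).
# Strings stand in for tensors so the sketch runs without TensorFlow.
top_k_outputs = ("<scores tensor>", "<ids tensor>")

# One name for two outputs: only the first output is kept.
before = build_signature_outputs(["top_k_output"], top_k_outputs)

# One name per output: both outputs are kept.
after = build_signature_outputs(["scores", "identifiers"], top_k_outputs)

print(before)
print(after)
```

Under that assumption, a single configured name leaves room for only one entry in the signature, which matches the single-output "Before" dump.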

Before

The given SavedModel SignatureDef contains the following input(s):
  inputs['category-list__offsets'] tensor_info:
      dtype: DT_INT32
      shape: (-1)
      name: serving_default_category-list__offsets:0
  inputs['category-list__values'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: serving_default_category-list__values:0
  inputs['dayofweek-first'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: serving_default_dayofweek-first:0
  inputs['item_id-list__offsets'] tensor_info:
      dtype: DT_INT32
      shape: (-1)
      name: serving_default_item_id-list__offsets:0
  inputs['item_id-list__values'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: serving_default_item_id-list__values:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['item_id-list/top_k_output'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 100)
      name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict

After

The given SavedModel SignatureDef contains the following input(s):
  inputs['category-list__offsets'] tensor_info:
      dtype: DT_INT32
      shape: (-1)
      name: serving_default_category-list__offsets:0
  inputs['category-list__values'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: serving_default_category-list__values:0
  inputs['dayofweek-first'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: serving_default_dayofweek-first:0
  inputs['item_id-list__offsets'] tensor_info:
      dtype: DT_INT32
      shape: (-1)
      name: serving_default_item_id-list__offsets:0
  inputs['item_id-list__values'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: serving_default_item_id-list__values:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['identifiers'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 100)
      name: StatefulPartitionedCall:0
  outputs['scores'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 100)
      name: StatefulPartitionedCall:1
Method name is: tensorflow/serving/predict
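
The two dumps can also be compared programmatically. The following is a minimal sketch that assumes the text has the layout shown above (it resembles the output of saved_model_cli show); parse_signature_outputs and SAMPLE_DUMP are hypothetical names introduced here, and the sample is an abbreviated copy of the "After" dump.

```python
import re

# Matches entry headers such as:   outputs['scores'] tensor_info:
_OUTPUT_LINE = re.compile(r"^\s*outputs\['(?P<name>[^']+)'\] tensor_info:\s*$")
# Matches the indented fields that follow an entry header.
_FIELD_LINE = re.compile(r"^\s*(?P<key>dtype|shape|name):\s*(?P<value>.+?)\s*$")


def parse_signature_outputs(dump):
    """Return {output_name: {"dtype": ..., "shape": ..., "name": ...}} from a text dump."""
    outputs = {}
    current = None  # dict of the outputs[...] entry being read, if any
    for line in dump.splitlines():
        header = _OUTPUT_LINE.match(line)
        if header:
            current = outputs.setdefault(header.group("name"), {})
            continue
        # An inputs[...] header or any unindented line ends the current output entry.
        if re.match(r"^\s*inputs\[", line) or not line.startswith(" "):
            current = None
            continue
        field = _FIELD_LINE.match(line)
        if field and current is not None:
            current[field.group("key")] = field.group("value")
    return outputs


SAMPLE_DUMP = """The given SavedModel SignatureDef contains the following input(s):
  inputs['dayofweek-first'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: serving_default_dayofweek-first:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['identifiers'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 100)
      name: StatefulPartitionedCall:0
  outputs['scores'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 100)
      name: StatefulPartitionedCall:1
Method name is: tensorflow/serving/predict
"""

parsed = parse_signature_outputs(SAMPLE_DUMP)
print(sorted(parsed))
```

Running the same function on the "Before" dump would yield a single key, item_id-list/top_k_output, which makes the missing identifiers output easy to spot.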

Testing Details :mag:

Adds an assertion for the expected output signature to the top-k encoder test.
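
A check of this kind could look roughly like the sketch below. It is not the test added in this PR: assert_signature_outputs is a hypothetical helper, and it assumes the reloaded model exposes a signatures mapping whose entries have a structured_outputs dictionary, as objects returned by tf.saved_model.load do. The helper is duck-typed so it carries no TensorFlow import of its own.

```python
def assert_signature_outputs(loaded, expected_names, signature_key="serving_default"):
    """Assert that a loaded model's serving signature exposes exactly the expected outputs.

    `loaded` is assumed to behave like the result of tf.saved_model.load:
    loaded.signatures[key].structured_outputs maps output names to tensor specs.
    """
    signature = loaded.signatures[signature_key]
    actual = set(signature.structured_outputs)
    expected = set(expected_names)
    assert actual == expected, (
        f"signature outputs {sorted(actual)} do not match expected {sorted(expected)}"
    )
    return actual


# Hypothetical usage after saving a top-k encoder (the path is a placeholder):
#
#   import tensorflow as tf
#   reloaded = tf.saved_model.load("path/to/saved_model")
#   assert_signature_outputs(reloaded, ["scores", "identifiers"])
```

Comparing sets rather than lists keeps the check independent of the order in which the outputs are listed.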

github-actions[bot] commented 11 months ago

Documentation preview

https://nvidia-merlin.github.io/models/review/pr-1225