NVIDIA-Merlin / models

Merlin Models is a collection of deep learning recommender system model reference implementations
https://nvidia-merlin.github.io/models/main/index.html
Apache License 2.0

[BUG] when we serve topK model on Triton it only returns scores #1220

Closed rnyak closed 10 months ago

rnyak commented 11 months ago

Bug description

When we serve a topK model for a session-based model on Triton, it returns only the scores, but it also needs to return the topK item ids together with the scores.

This issue is related to model signatures: the topK model's signature exposes only one output, but it should expose two.
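To make the expected pair of outputs concrete, here is a minimal pure-Python sketch of a top-k selection that returns both ids and scores. It is an illustration only, not the Merlin Models implementation: the helper name `top_k` is hypothetical, and it assumes an item's id is simply its index in the score vector.

```python
from typing import List, Tuple


def top_k(scores: List[float], k: int) -> Tuple[List[int], List[float]]:
    """Return (ids, scores) for the k highest-scoring items, best first.

    Assumption for this sketch: the id of an item is its position in `scores`.
    """
    # Rank item indices by descending score and keep the first k.
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
    # Two outputs: the ids and the matching scores, in the same order.
    return ranked, [scores[i] for i in ranked]


ids, vals = top_k([0.1, 0.7, 0.2, 0.9], k=2)
print(ids, vals)  # [3, 1] [0.9, 0.7]
```

A served topK model would be expected to expose both of these lists as separate outputs, rather than the scores alone.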

Steps/Code to reproduce bug

Run this gist first, then check the ensemble model output. You will see that it has only one output, not two.

To check the Triton response, do the following steps:

  1. Launch Triton in a terminal and load the model with tritonserver --model-repository={OUTPUT_DATA_DIR}/<name of ensemble folder>/

  2. Prepare input data and send a request:

import pandas as pd
import tritonclient.grpc as grpcclient
from merlin.systems.triton import convert_df_to_triton_input

# `wf` is assumed to be the workflow object created by the gist above.
validation_data = pd.read_parquet('/workspace/data/interactions_merged_df.parquet')
inputs = convert_df_to_triton_input(wf.input_schema, validation_data.iloc[:100])

with grpcclient.InferenceServerClient("localhost:8001") as client:
    response = client.infer('executor_model', inputs)

# This is the only output in the response: the scores, without the topK ids.
output = response.as_numpy('item_id-list/categorical_output')
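One way to confirm how many outputs the served model exposes is to ask Triton for the model metadata and list the output names. The sketch below is an assumption about how one might check this, not part of the original report: `list_output_names` is a hypothetical helper, and it expects a `tritonclient.grpc.InferenceServerClient` connected to a server that already has the model loaded.

```python
from typing import List


def list_output_names(client, model_name: str) -> List[str]:
    """Return the output tensor names Triton reports for a loaded model.

    `client` is expected to be a tritonclient.grpc.InferenceServerClient;
    its get_model_metadata() response carries an `outputs` field whose
    entries each have a `name`.
    """
    metadata = client.get_model_metadata(model_name)
    return [out.name for out in metadata.outputs]


# Usage against a running server (same address and model name as above):
#
#   import tritonclient.grpc as grpcclient
#   with grpcclient.InferenceServerClient("localhost:8001") as client:
#       print(list_output_names(client, "executor_model"))
```

With the behavior described in this issue, the list would be expected to contain a single name; a fix should make it contain two, one for the scores and one for the ids.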

Expected behavior

The Triton response should contain two outputs: the topK scores and the corresponding topK item ids.

Environment details

Additional context