NVIDIA-Merlin / Transformers4Rec

Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation and works with PyTorch.
https://nvidia-merlin.github.io/Transformers4Rec/main
Apache License 2.0

Add topk arg to return topk items and scores at inference step #678

Closed (rnyak closed 1 year ago)

rnyak commented 1 year ago

This PR adds support for returning the top-k most relevant item ids (those with the highest scores) from Triton Inference Server for the NextItemPrediction task.
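For readers following along, here is a minimal sketch of what "top-k item ids and scores" means for a single session. It is not the code in this PR: the helper name `top_k_items` and the sample scores are hypothetical, and it uses plain Python rather than the PyTorch tensors the model works with (where `torch.topk` would typically do the same job). The position of a score in the list stands in for the item id.

```python
import heapq
from typing import List, Sequence, Tuple


def top_k_items(scores: Sequence[float], k: int) -> Tuple[List[float], List[int]]:
    """Return the k highest scores and the item ids (positions) that produced them."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Pair each score with its index; the index stands in for the item id.
    best = heapq.nlargest(k, enumerate(scores), key=lambda pair: pair[1])
    item_ids = [idx for idx, _ in best]
    item_scores = [score for _, score in best]
    return item_scores, item_ids


# Hypothetical scores over a 6-item catalog for one session.
session_scores = [0.05, 0.40, 0.10, 0.25, 0.15, 0.05]
item_scores, item_ids = top_k_items(session_scores, k=3)
print(item_ids)     # ids ordered from most to least relevant
print(item_scores)  # matching scores, in the same order
```

The ids are integers while the scores are floats, which is why the two cannot share a single output declared with one dtype.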

Current blocker:

The code is designed to return top_k item ids (int64 dtype), but model.output_schema declares next_item as float32, which causes an error in Triton.

Should we change the code base so that model.output_schema matches the output and output dtype that Triton expects? Or should we return the top_k item scores instead of the item ids?

Status update:

After modifying the model.output_schema, we can now return two outputs (item_scores, item_ids) from Triton.
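To illustrate the blocker and the fix, here is a rough sketch of the kind of check that fails when the declared and produced dtypes disagree. It is an assumption-laden toy: `find_dtype_mismatches` is a hypothetical helper, schemas are modeled as plain name-to-dtype dicts rather than the real model.output_schema object, and float32 for item_scores is assumed (the thread only states int64 for the ids and float32 for next_item).

```python
from typing import Dict, List


def find_dtype_mismatches(declared: Dict[str, str], produced: Dict[str, str]) -> List[str]:
    """Compare declared output dtypes against what the model actually emits."""
    problems = []
    for name in sorted(set(declared) | set(produced)):
        want, got = declared.get(name), produced.get(name)
        if want is None:
            problems.append(f"{name}: produced as {got} but not declared")
        elif got is None:
            problems.append(f"{name}: declared as {want} but not produced")
        elif want != got:
            problems.append(f"{name}: declared {want}, produced {got}")
    return problems


# Blocker: the schema says float32, the top-k code emits int64 ids.
blocked = find_dtype_mismatches({"next_item": "float32"}, {"next_item": "int64"})

# After the schema change: two outputs, each with its own dtype (float32 assumed for scores).
two_outputs = {"item_scores": "float32", "item_ids": "int64"}
fixed = find_dtype_mismatches(two_outputs, two_outputs)

print(blocked)
print(fixed)
```

Declaring the scores and the ids as separate outputs lets each keep its natural dtype instead of forcing both through a single float32 column.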

Remaining tasks:

github-actions[bot] commented 1 year ago

Documentation preview

https://nvidia-merlin.github.io/Transformers4Rec/review/pr-678

rnyak commented 1 year ago

rerun tests
