Add topk arg to return topk items and scores at inference step

rnyak commented 1 year ago

This PR adds support for returning the top-k most relevant item ids (those with the highest scores) from Triton Inference Server for the NextItemPrediction task.

Current blocker:

~~The code is designed to return top_k item ids (int64 dtype), but model.output_schema returns next_item as float32 dtype, which causes Triton to raise an error.~~

Should we change the code base so that model.output_schema matches the expected output and output dtype from Triton? Or should we return the scores of the top_k items instead of the item ids?

Status update:

After modifying the model.output_schema, we can now return two outputs (item_scores, item_ids) from Triton.
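
For reference, the two outputs can be thought of as the result of a top-k selection over the per-item scores. The following is a minimal sketch in plain Python, not the code in this PR: `top_k_items` and the sample scores are hypothetical, and item ids are assumed to be the positions in the score vector.

```python
import heapq
from typing import List, Sequence, Tuple


def top_k_items(scores: Sequence[float], k: int) -> Tuple[List[float], List[int]]:
    """Return (item_scores, item_ids) for the k highest-scoring items.

    Assumption: the item id is the index of the score in `scores`.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # enumerate yields (item_id, score) pairs; nlargest sorts them by
    # descending score, and the negated id breaks ties toward the lower id.
    best = heapq.nlargest(k, enumerate(scores), key=lambda pair: (pair[1], -pair[0]))
    item_ids = [item_id for item_id, _ in best]
    item_scores = [score for _, score in best]
    return item_scores, item_ids


# Hypothetical scores for five items (ids 0..4).
item_scores, item_ids = top_k_items([0.1, 0.7, 0.05, 0.9, 0.3], k=3)
```

Here `item_ids` holds integers and `item_scores` holds floats, which is why the two outputs need different dtypes in the output schema.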

Remaining tasks:

[x] make sure the dtype of the categorical item-id column in model.output_schema matches the one in model.input_schema
[x] add a unit test
[x] add an example notebook to showcase the top-k layer, or modify one of the existing examples: this will be taken care of by PR https://github.com/NVIDIA-Merlin/Transformers4Rec/pull/680
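
The first task amounts to a consistency check between the two schemas. A rough sketch follows, with plain dicts standing in for the schemas. The real `model.input_schema` and `model.output_schema` objects have their own API, which is not shown here; the helper `check_item_id_dtype` and the column names `item_id` and `item_ids` are hypothetical.

```python
from typing import Dict


def check_item_id_dtype(
    input_schema: Dict[str, str],
    output_schema: Dict[str, str],
    input_col: str,
    output_col: str,
) -> None:
    """Raise TypeError if the item-id dtypes of the two schemas differ.

    Assumption: each schema is a mapping from column name to dtype name.
    """
    in_dtype = input_schema[input_col]
    out_dtype = output_schema[output_col]
    if in_dtype != out_dtype:
        raise TypeError(
            f"{output_col!r} is {out_dtype} in the output schema, "
            f"but {input_col!r} is {in_dtype} in the input schema"
        )


# Hypothetical schemas: ids are int64 on both sides, scores are float32.
input_schema = {"item_id": "int64"}
output_schema = {"item_ids": "int64", "item_scores": "float32"}
check_item_id_dtype(input_schema, output_schema, "item_id", "item_ids")
```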

github-actions[bot] commented 1 year ago

Documentation preview

https://nvidia-merlin.github.io/Transformers4Rec/review/pr-678

rnyak commented 1 year ago

rerun tests

rnyak commented 1 year ago

rerun tests
