NVIDIA-Merlin / Merlin

NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in production.
Apache License 2.0

[RMP] Provide to our customers best practices guidance for training Retrieval, Ranking and Multi-Stage RecSys models #553

Closed gabrielspmoreira closed 1 year ago

gabrielspmoreira commented 1 year ago

Problem:

The Merlin platform provides tools for building multi-stage recommender systems (which we are going to present in our tutorial and demo at the RecSys’22 conference). In particular, the retrieval and ranking stages are implemented in the Merlin Models library and tied together during inference with the Merlin Systems library. Although we provide our customers with implementations of retrieval and ranking models, and a few example notebooks with toy datasets to get them started, it might be hard for them to obtain reasonable accuracy and performance on their own datasets without extensive experimentation. In addition, customers are likely to run into issues in their experiments related to our API flexibility and to model accuracy/performance with real datasets. This might reduce customers’ interest and engagement if they conclude that Merlin has not been thoroughly tested and refined with real datasets and is not mature enough for their purposes.

Goal:

The goal of this work is to improve the customer experience when they start experimenting with Merlin on their own datasets. We want to leverage NVIDIA's internal research and computational resources to perform comprehensive experimentation with our retrieval and ranking models on a diverse set of public datasets, so that we can:

Constraints:

Starting Point:

Our research team has already done or is doing experimentation work for some RecSys use cases:

Tasks

Ranking Models

Retrieval Models

Finish the refactoring of the YouTubeDNN retrieval model

Two-stage recommendation

Investigate/research how to better train two-stage recsys pipelines

EvenOldridge commented 1 year ago

@gabrielspmoreira @viswa-nvidia to update checkboxes and convert to issues across the releases.

gabrielspmoreira commented 1 year ago

FYI, I updated the checkboxes and created the tasks a few days ago.

gabrielspmoreira commented 1 year ago

Created some slides with an overview of the proposal to discuss in the next grooming meeting

gabrielspmoreira commented 1 year ago

I have written a new RMP, NVIDIA-Merlin/Merlin#732, as a rewrite of this one with a more customer-centric and pragmatic approach: a quick-start pipeline for customers. I am keeping this RMP unchanged because it was already prioritized, but I propose closing it and using NVIDIA-Merlin/Merlin#732 as its replacement. The new one also reduces the scope of the dataset experiments, as we are going to stick initially to a single dataset (TenRec) as an example of how to use the quick-start template pipeline.

gabrielspmoreira commented 1 year ago

@viswa-nvidia @EvenOldridge I am closing this ticket, as I rewrote it with a customer-centric approach in NVIDIA-Merlin/Merlin#732, which is now assigned to 23.01.