NVIDIA Merlin is an open source library that accelerates recommender systems on NVIDIA GPUs. The library enables data scientists, machine learning engineers, and researchers to build high-performing recommenders at scale. Merlin includes tools to address common feature engineering, training, and inference challenges. Each stage of the Merlin pipeline is optimized to support hundreds of terabytes of data, all of which is accessible through easy-to-use APIs. For more information, see NVIDIA Merlin on the NVIDIA developer website.
NVIDIA Merlin is a scalable and GPU-accelerated solution, making it easy to build recommender systems from end to end. With NVIDIA Merlin, you can:
NVIDIA Merlin consists of the following open source libraries:
NVTabular
NVTabular is a feature engineering and preprocessing library for tabular
data. The library can quickly and easily manipulate terabyte-size datasets that
are used to train deep learning based recommender systems. The library offers a
high-level API that can define complex data transformation workflows. With
NVTabular, you can:
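
To make the idea of a data transformation workflow concrete, the following is a minimal, library-free sketch of the general two-phase pattern (gather statistics, then apply them) behind this kind of preprocessing. It is not NVTabular's API: the function names (`fit_workflow`, `transform_workflow`, and so on), the column names, and the sample rows are all hypothetical, and the code runs on plain Python lists rather than on GPU DataFrames.

```python
from collections import Counter
from statistics import mean, pstdev


def fit_categorify(values):
    """Map each category to a contiguous integer id; 0 is reserved for unseen values."""
    counts = Counter(values)
    # Most frequent categories get the smallest ids; ties break alphabetically.
    ordered = sorted(counts, key=lambda v: (-counts[v], str(v)))
    return {value: idx + 1 for idx, value in enumerate(ordered)}


def fit_normalize(values):
    """Collect the mean and standard deviation needed for standardization."""
    # Fall back to 1.0 so constant columns do not cause a division by zero.
    return {"mean": mean(values), "std": pstdev(values) or 1.0}


def fit_workflow(rows, cat_cols, cont_cols):
    """Fit phase: gather per-column statistics from the training data."""
    stats = {}
    for col in cat_cols:
        stats[col] = fit_categorify([r[col] for r in rows])
    for col in cont_cols:
        stats[col] = fit_normalize([r[col] for r in rows])
    return stats


def transform_workflow(rows, stats, cat_cols, cont_cols):
    """Transform phase: apply the fitted statistics to any set of rows."""
    out = []
    for r in rows:
        new = dict(r)
        for col in cat_cols:
            new[col] = stats[col].get(r[col], 0)  # unseen category -> 0
        for col in cont_cols:
            s = stats[col]
            new[col] = (r[col] - s["mean"]) / s["std"]
        out.append(new)
    return out


# Hypothetical interaction data.
rows = [
    {"user": "a", "item": "x", "price": 10.0},
    {"user": "b", "item": "x", "price": 20.0},
    {"user": "a", "item": "y", "price": 30.0},
]
stats = fit_workflow(rows, ["user", "item"], ["price"])
encoded = transform_workflow(rows, stats, ["user", "item"], ["price"])
```

Separating the fit phase from the transform phase means the statistics learned on training data can be reapplied unchanged to validation or serving data.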
HugeCTR
HugeCTR is a GPU-accelerated training framework that can scale large deep learning
recommendation models by distributing training across multiple GPUs and nodes.
HugeCTR contains optimized data loaders with GPU-acceleration and provides
strategies for scaling large embedding tables beyond available memory. With
HugeCTR, you can:
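
As a rough illustration of why distributing an embedding table helps when it does not fit in one device's memory, the sketch below splits the rows of a toy table across several shards and routes each lookup to the shard that owns the row. This is a conceptual example only, not HugeCTR's implementation or API: the `ShardedEmbedding` class, the modulo placement rule, and the sizes are assumptions chosen for brevity, and each "shard" is an ordinary Python dictionary standing in for one GPU's memory.

```python
import random


class ShardedEmbedding:
    """Toy embedding table whose rows are split across several shards.

    Hypothetical illustration: each shard stands in for the memory of one GPU.
    """

    def __init__(self, num_rows, dim, num_shards, seed=0):
        rng = random.Random(seed)
        self.num_shards = num_shards
        self.dim = dim
        # shards[s] maps a global row id to its vector; a shard stores only the rows it owns.
        self.shards = [dict() for _ in range(num_shards)]
        for row in range(num_rows):
            vec = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
            self.shards[self.owner(row)][row] = vec

    def owner(self, row):
        """Placement rule (an assumption): assign rows to shards round-robin."""
        return row % self.num_shards

    def lookup(self, ids):
        """Return one vector per requested id, preserving the request order."""
        # Group the requests by owning shard so each shard is visited once.
        by_shard = {}
        for pos, row in enumerate(ids):
            by_shard.setdefault(self.owner(row), []).append((pos, row))
        result = [None] * len(ids)
        for shard_id, requests in by_shard.items():
            shard = self.shards[shard_id]
            for pos, row in requests:
                result[pos] = shard[row]
        return result


table = ShardedEmbedding(num_rows=10, dim=4, num_shards=3)
vectors = table.lookup([7, 2, 7, 9])
shard_sizes = [len(s) for s in table.shards]
```

With this placement rule no single shard holds more than roughly `num_rows / num_shards` rows, which is what lets the total table exceed the capacity of any one device.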
Merlin Models
The Merlin Models library provides standard models for recommender systems. It
aims to offer high-quality implementations that range from classic machine learning
models to highly advanced deep learning models. With Merlin Models, you can:
Transformers4Rec
The Transformers4Rec library provides sequential and session-based recommendation.
The library provides modular building blocks that are compatible with standard PyTorch modules.
You can use the building blocks to design custom architectures such as multiple towers, multiple heads and tasks, and losses.
With Transformers4Rec, you can:
Merlin Systems
Merlin Systems provides tools for combining recommendation models with other
elements of production recommender systems like feature stores, nearest neighbor
search, and exploration strategies into end-to-end recommendation pipelines that
can be served with Triton Inference Server. With Merlin Systems, you can:
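
The way retrieval, feature lookup, and ranking fit together in an end-to-end pipeline can be sketched in a few lines of plain Python. This is a conceptual example under stated assumptions, not the Merlin Systems API: the function names, the in-memory `feature_store` dictionary, the brute-force nearest neighbor search, and the popularity-based `score_fn` (a stand-in for a trained ranking model) are all hypothetical.

```python
def nearest_neighbors(query, item_vectors, k):
    """Retrieval stage: the k items with the highest dot product (brute force)."""
    scored = sorted(
        item_vectors.items(),
        key=lambda kv: -sum(q * v for q, v in zip(query, kv[1])),
    )
    return [item_id for item_id, _ in scored[:k]]


def fetch_features(item_ids, feature_store):
    """Feature stage: enrich each candidate with its stored features."""
    return [{"item_id": i, **feature_store[i]} for i in item_ids]


def rank(candidates, score_fn):
    """Ranking stage: order candidates by model score, best first."""
    return sorted(candidates, key=score_fn, reverse=True)


def recommend(user_vector, item_vectors, feature_store, score_fn, k=3, top_n=2):
    """Chain the stages: retrieve k candidates, enrich them, rank, keep top_n."""
    candidates = nearest_neighbors(user_vector, item_vectors, k)
    enriched = fetch_features(candidates, feature_store)
    ranked = rank(enriched, score_fn)
    return [c["item_id"] for c in ranked[:top_n]]


# Hypothetical item embeddings and feature store contents.
item_vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.5, 0.5]}
feature_store = {
    "a": {"popularity": 0.2},
    "b": {"popularity": 0.9},
    "c": {"popularity": 0.5},
    "d": {"popularity": 0.7},
}
user_vector = [1.0, 0.2]

# A stand-in for a trained ranking model: score by popularity.
recs = recommend(user_vector, item_vectors, feature_store, lambda c: c["popularity"])
```

Retrieval narrows the catalog cheaply to a few candidates, so the more expensive ranking step only has to score that short list.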
Merlin Core
Merlin Core provides functionality that is used throughout the Merlin ecosystem.
With Merlin Core, you can:
The simplest way to use Merlin is to run a Docker container. NVIDIA GPU Cloud (NGC) provides containers that include all the Merlin component libraries and their dependencies, and that receive unit and integration testing. For more information, see the Containers page.
To develop and contribute to Merlin, review the installation documentation for each component library. The development environment for each Merlin component is easily set up with conda or pip.
A collection of end-to-end examples is available in the form of Jupyter notebooks. The example notebooks demonstrate how to:
These examples are based on different datasets and provide a wide range of real-world use cases.
RAPIDS cuDF
Merlin relies on cuDF for
GPU-accelerated DataFrame operations used in feature engineering.
Dask
Merlin relies on Dask to distribute and scale
feature engineering and preprocessing within NVTabular and to accelerate
dataloading in Merlin Models and HugeCTR.
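
The general pattern behind distributing preprocessing is to compute small, mergeable summaries per data partition and then combine them. The sketch below shows that pattern with the Python standard library only. It does not use Dask or reproduce how Merlin calls it: the thread pool stands in for a cluster of workers, and the function names and sample partitions are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_stats(partition):
    """Per-partition summary that can be merged without revisiting the rows."""
    return (len(partition), sum(partition), sum(x * x for x in partition))


def merge_stats(parts):
    """Combine the partial summaries into global count, mean, and variance."""
    n = sum(p[0] for p in parts)
    total = sum(p[1] for p in parts)
    total_sq = sum(p[2] for p in parts)
    mean = total / n
    variance = total_sq / n - mean * mean  # population variance
    return {"count": n, "mean": mean, "variance": variance}


# Hypothetical column values, already split into three partitions.
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]

# Each partition is summarized independently; here threads play the role of workers.
with ThreadPoolExecutor(max_workers=3) as pool:
    parts = list(pool.map(partial_stats, partitions))

summary = merge_stats(parts)
```

Because each worker returns only three numbers per partition, the merge step stays cheap no matter how large the individual partitions are.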
Triton Inference Server
Merlin leverages Triton Inference Server to provide GPU-accelerated serving for
recommender system pipelines.
To report bugs or get help, please open an issue.