NVIDIA-Merlin / Merlin

NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in production.
Apache License 2.0

[RMP] Benchmarking Session-Based Models #806

Open bschifferer opened 1 year ago

bschifferer commented 1 year ago

Problem:

We want to benchmark session-based (transformer-based) architectures with respect to speed-up, costs, inference, latency, etc., to provide guidance to our community.

Goal:

Provide guidance to our community about the computational performance and costs of transformer-based models for training and inference.

Starting Point:

Let's start with inference.

Background

[ ] Define experiments: which dataset, which architecture, which hyperparameters (e.g. sequence length)
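One way to pin down the experiment definition is to enumerate it as a grid. The sketch below is only an illustration: the `Experiment` class, the `build_grid` helper, the architecture labels, and every sequence length and batch size are hypothetical placeholders, not values agreed in this issue. Only the REES46 dataset name comes from the task list.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Sequence


@dataclass(frozen=True)
class Experiment:
    """One benchmark configuration (hypothetical structure)."""
    dataset: str
    architecture: str
    max_sequence_length: int
    batch_size: int


def build_grid(
    datasets: Sequence[str],
    architectures: Sequence[str],
    sequence_lengths: Sequence[int],
    batch_sizes: Sequence[int],
) -> List[Experiment]:
    """Return the Cartesian product of all options as Experiment objects."""
    return [
        Experiment(*combo)
        for combo in product(datasets, architectures, sequence_lengths, batch_sizes)
    ]


GRID = build_grid(
    datasets=["REES46"],
    # Labels only; they do not refer to real configuration keys.
    architectures=["transformers4rec-pytorch", "merlin-models-tensorflow"],
    sequence_lengths=[10, 20, 50],  # placeholder values
    batch_sizes=[1, 16, 64],        # placeholder values
)

for experiment in GRID:
    print(experiment)
```

Writing the grid down this way makes the number of runs explicit (here 1 x 2 x 3 x 3 = 18) before any benchmarking time is spent.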

Inference

What questions do we want to answer:

Transformers4Rec (PyTorch)

[x] Benchmark inference of a Transformers4Rec model without NVTabular (Python model), like this example. Ticket: https://github.com/NVIDIA-Merlin/Transformers4Rec/issues/610
[ ] Benchmark inference of a Transformers4Rec model without NVTabular (TorchScript model), like this example

Merlin Models (TensorFlow)

[ ] Benchmark inference for REES46 eCommerce

Training

TBD

We should use JMeter for load testing.
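Before setting up a full load test, a quick single-client latency check can help sanity-check the numbers. The sketch below is not a replacement for JMeter and does not call any Merlin or Triton API: `measure_latency` and `dummy_predict` are hypothetical names, and `dummy_predict` stands in for whatever function actually sends a request to the deployed model.

```python
import statistics
import time
from typing import Callable, Dict, Sequence


def measure_latency(
    predict: Callable[[object], object],
    requests: Sequence[object],
    warmup: int = 10,
) -> Dict[str, float]:
    """Call `predict` once per request and summarize latency in milliseconds."""
    # Warm-up calls are not timed, so one-time setup costs do not skew results.
    for request in requests[:warmup]:
        predict(request)

    latencies_ms = []
    start_all = time.perf_counter()
    for request in requests:
        start = time.perf_counter()
        predict(request)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    total_s = time.perf_counter() - start_all

    # 99 cut points; index k - 1 holds the k-th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "count": float(len(latencies_ms)),
        "mean_ms": statistics.fmean(latencies_ms),
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "p99_ms": cuts[98],
        "throughput_rps": len(latencies_ms) / total_s,
    }


def dummy_predict(session):
    """Hypothetical stand-in for a model call: sorts the session's item IDs."""
    return sorted(session)


stats = measure_latency(dummy_predict, [[3, 1, 2]] * 200, warmup=20)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

This measures sequential requests from one client only, so it says nothing about behavior under concurrent load, which is what the JMeter runs are for. If the timed call runs on a GPU, it needs to block until results are ready, otherwise the wall-clock timings will be misleading.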

bschifferer commented 1 year ago

A detailed view is available here: https://docs.google.com/document/d/1g5FUrdhZQzef1OWwiQLfNNGdHr4a71Cr-jqndl-SoQg/edit#

bschifferer commented 1 year ago

Collecting results in a Google spreadsheet (details), plus some slides as a summary.