NVIDIA-Merlin / models

Merlin Models is a collection of deep learning recommender system model reference implementations

https://nvidia-merlin.github.io/models/main/index.html

Apache License 2.0


[Task] Profiling the training&evaluation pipeline for retrieval and ranking models #340

Open gabrielspmoreira opened 2 years ago

gabrielspmoreira commented 2 years ago

🚀 Feature request

Profile the training and evaluation pipeline for retrieval and ranking models to find the bottlenecks that need to be fixed to improve performance and GPU utilization.

For this task, the research scripts for training and evaluating retrieval and ranking models will be run on the TenRec dataset. The instructions for setting up the experiments (container, dataset download, and how to run the scripts) are available in this document.

viswa-nvidia commented 2 years ago

@gabrielspmoreira, is this a bug or a feature request? The description says "steps to reproduce bug". Please clarify.

viswa-nvidia commented 2 years ago

@gabrielspmoreira to coordinate with Valerie Sarge on defining the task.

gabrielspmoreira commented 2 years ago

We are observing slower runtime in the script that uses the V2 API of the retrieval models than in the one that uses V1. @sararb is finishing her investigation of the V2 implementation in PR #790 and is trying to isolate which block or blocks introduce the slowdown. The profiling task can start once we have made sure the V2 implementation is at least as fast as V1.
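
To illustrate what isolating a slow block can look like, here is a minimal sketch that accumulates wall-clock time per named block. It is not taken from the Merlin Models code or from PR #790: `BlockTimer`, `fast_block`, `slow_block`, and `run_pipeline` are hypothetical names, and the two block functions are stand-ins for real model blocks. It uses only the Python standard library.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class BlockTimer:
    """Accumulates wall-clock time and call counts per named block (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.calls = defaultdict(int)

    @contextmanager
    def track(self, name):
        # Time everything executed inside the `with` body.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start
            self.calls[name] += 1

    def slowest(self):
        # Name of the block with the largest accumulated time.
        return max(self.totals, key=self.totals.get)

    def report(self):
        # (name, seconds) pairs, slowest first.
        return sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical stand-ins for model blocks; real blocks would be layers or data-loading steps.
def fast_block(batch):
    return [v * 2 for v in batch]


def slow_block(batch):
    time.sleep(0.01)  # simulated overhead
    return [v + 1 for v in batch]


def run_pipeline(timer, steps=5):
    batch = list(range(8))
    out = batch
    for _ in range(steps):
        with timer.track("fast_block"):
            out = fast_block(batch)
        with timer.track("slow_block"):
            out = slow_block(out)
    return out


timer = BlockTimer()
result = run_pipeline(timer)
for name, seconds in timer.report():
    print(f"{name}: {seconds:.4f}s over {timer.calls[name]} calls")
```

Note that GPU work is typically launched asynchronously, so wall-clock timing around a call can under-report the real cost unless the device is synchronized first. For the actual V1 vs V2 comparison, a framework profiler would likely be the more reliable tool.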

gabrielspmoreira commented 1 year ago

I have provided the instructions for setting up the profiling environment for @vysarge in this document.