mlcommons / training

Reference implementations of MLPerf™ training benchmarks
https://mlcommons.org/en/groups/training
Apache License 2.0

MLCube: Recommendation benchmark #510

Closed davidjurado closed 1 year ago

davidjurado commented 2 years ago

Benchmark execution with MLCube

Project setup

# Create a Python environment and install the MLCube Docker runner
virtualenv -p python3 ./env && source ./env/bin/activate && pip install mlcube-docker

# Fetch the recommendation workload
git clone https://github.com/mlcommons/training && cd ./training
git fetch origin pull/510/head:feature/mlcube_recommendation && git checkout feature/mlcube_recommendation
cd ./recommendation/mlcube

Dataset

The MovieLens dataset is downloaded and processed by the tasks below. Dataset size at each step:

| Dataset Step | MLCube Task | Format | Size |
|---|---|---|---|
| Download (raw dataset) | download_data | .tar | ~3.1 GB |
| Extract (extracted dataset) | download_data | *.npz | ~3.1 GB |
| Total (after all tasks) | All | | ~6.2 GB |
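
Since the tasks need roughly 6.2 GB in total, it may help to check free disk space before starting the download. The following is a minimal sketch, not part of this PR: `REQUIRED_KB`, `TARGET_DIR` and `has_free_space` are hypothetical names, and the 7 GiB threshold is simply the table total rounded up for headroom.

```shell
#!/bin/sh
# Sketch: check free disk space before running the download_data task.
# REQUIRED_KB, TARGET_DIR and has_free_space are hypothetical names for this example.
set -eu

# ~6.2 GB total from the table above, rounded up to 7 GiB for headroom (assumption).
REQUIRED_KB=$((7 * 1024 * 1024))
# Directory whose filesystem will hold the data; defaults to the current directory.
TARGET_DIR="${TARGET_DIR:-.}"

# has_free_space DIR KB: succeeds if DIR's filesystem has at least KB kilobytes available.
has_free_space() {
  # df -P prints one POSIX-format line per filesystem; field 4 is the available space.
  avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
  [ "$avail_kb" -ge "$2" ]
}

if has_free_space "$TARGET_DIR" "$REQUIRED_KB"; then
  echo "Enough free space in $TARGET_DIR for the dataset."
else
  echo "Less than ${REQUIRED_KB} KB free in $TARGET_DIR; free up space or use another data_dir."
fi
```

Set `TARGET_DIR` to the directory you intend to pass as `data_dir=DATA_DIR` if you override the default location.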

Tasks execution

# Download the MovieLens dataset. Default path: mlcube/workspace/data
# To override it, use data_dir=DATA_DIR
mlcube run --task download_data

# Preprocess the MovieLens dataset
# It uses a subdirectory of the DATA_DIR path defined in the previous step
mlcube run --task preprocess_data

# Run the benchmark. Default path: input_dir = mlcube/workspace/processed_data
# Parameters to override: input_dir=DATA_DIR, output_dir=OUTPUT_DIR, parameters_file=PATH_TO_TRAINING_PARAMS
mlcube run --task train

We are targeting a pull-type installation, so MLCube images should be available on Docker Hub. If an image is not available, force a local build:

mlcube run ... -Pdocker.build_strategy=always

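
To run the three tasks in order with the same extra arguments, a small wrapper can help. This is a sketch under assumptions rather than something shipped in this PR: `run_all_tasks` and `DRY_RUN` are hypothetical names, and by default the script only prints the commands so they can be reviewed before anything is executed.

```shell
#!/bin/sh
# Sketch: run the MLCube tasks from this PR in order.
# run_all_tasks and DRY_RUN are hypothetical names used only in this example.
set -eu

# 1 = only print the commands, 0 = actually invoke mlcube.
DRY_RUN="${DRY_RUN:-1}"

# run_all_tasks [EXTRA_ARGS...]: extra arguments are appended unchanged to every task.
run_all_tasks() {
  for task in download_data preprocess_data train; do
    if [ "$DRY_RUN" = "1" ]; then
      echo mlcube run --task "$task" "$@"
    else
      mlcube run --task "$task" "$@"
    fi
  done
}

# Example: force a local image build for every task.
run_all_tasks -Pdocker.build_strategy=always
```

Run it with `DRY_RUN=0` once the printed commands look right. Because `set -e` is active, a failing task stops the sequence before the next one starts.
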
github-actions[bot] commented 2 years ago

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

johntran-nv commented 1 year ago

@davidjurado I'm tempted to close this since we're going to replace DLRM in v3.0. Is that ok with you?

matthew-frank commented 1 year ago

This seems to be a modification to the long-retired NCF benchmark, rather than the current DLRM version of the recommendation benchmark.

In an effort to do a better job maintaining this repo, we're closing PRs for retired benchmarks. The old benchmark code still exists, but has been moved to https://github.com/mlcommons/training/tree/master/retired_benchmarks/ncf.

If you think there is useful cleanup to be done to the retired_benchmarks subtree, please submit a new PR.