mlcommons / training

Reference implementations of MLPerf™ training benchmarks
https://mlcommons.org/en/groups/training
Apache License 2.0

Add MLCube support for Image Segmentation Benchmark #494

Open davidjurado opened 3 years ago

davidjurado commented 3 years ago

Used PRs #465 and #491 as references.

Current implementation

We'll be updating this section as we merge MLCube PRs and make new MLCube releases.

Benchmark execution with MLCube

Project setup

# Create Python environment and install MLCube Docker runner 
virtualenv -p python3 ./env && source ./env/bin/activate && pip install mlcube-docker

# Fetch the image segmentation workload
git clone https://github.com/mlcommons/training && cd ./training
git fetch origin pull/494/head:feature/mlcube_image_segmentation && git checkout feature/mlcube_image_segmentation
cd ./image_segmentation/mlcube

Dataset

The KiTS19 dataset will be downloaded and preprocessed. Approximate dataset size at each step:

| Dataset step | MLCube task | Format | Size |
| --- | --- | --- | --- |
| Download (raw dataset) | download_data | nii.gz | ~29 GB |
| Preprocess (processed dataset) | preprocess_data | npy | ~31 GB |
| Total (after all tasks) | All | | ~60 GB |
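
Since the two dataset steps together need roughly 60 GB, it may be worth checking free disk space before starting the download. The following is a minimal sketch, not part of this PR: the function name has_free_space is hypothetical, and it treats 1 GB as 1024^3 bytes for simplicity.

```shell
# Hypothetical helper (not part of MLCube or this PR).
# Returns 0 if the filesystem holding directory $1 has at least $2 GB available.
has_free_space() {
  dir="$1"
  required_gb="$2"
  # df -P prints one POSIX-format line per filesystem; -k reports 1024-byte blocks.
  # Column 4 of the second line is the available space in KB.
  avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  # Assumption: 1 GB is treated as 1024 * 1024 KB.
  required_kb=$((required_gb * 1024 * 1024))
  [ "$avail_kb" -ge "$required_kb" ]
}
```

For example, running has_free_space . 60 || echo "Less than 60 GB free" from the directory that will hold the workspace prints a warning when space is short.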

Task execution

# Download KiTS19 dataset. Default path = mlcube/workspace/data
# To override it, use data_dir=DATA_DIR
mlcube run --task download_data

# Preprocess KiTS19 dataset
# It will use a subdirectory from the DATA_DIR path defined in the previous step
mlcube run --task preprocess_data

# Run benchmark. Default path: input_dir = mlcube/workspace/processed_data
# Parameters that can be overridden: input_dir=DATA_DIR, output_dir=OUTPUT_DIR, parameters_file=PATH_TO_TRAINING_PARAMS
mlcube run --task train
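
The parameter overrides mentioned in the comments above can be wrapped in a small helper so that the exact command is visible before it runs. This is only a sketch under assumptions: run_task and DRY_RUN are hypothetical names, and it assumes overrides are passed as key=value arguments appended after the task name, as the comments suggest.

```shell
# Hypothetical wrapper (not part of MLCube or this PR).
# Usage: run_task TASK [key=value ...]
# With DRY_RUN=1 the command is only printed; otherwise it is executed.
run_task() {
  task="$1"
  shift
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Build a printable command line; arguments are not re-quoted,
    # so paths containing spaces will look ambiguous in this output.
    cmd="mlcube run --task $task"
    if [ "$#" -gt 0 ]; then
      cmd="$cmd $*"
    fi
    echo "$cmd"
  else
    # Assumption: overrides are appended as key=value arguments.
    mlcube run --task "$task" "$@"
  fi
}
```

For example, DRY_RUN=1 run_task train input_dir=DATA_DIR output_dir=OUTPUT_DIR prints the command without starting the benchmark. DATA_DIR and OUTPUT_DIR are placeholders, as in the comments above.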

We are targeting pull-type installation, so MLCube images should be available on Docker Hub. If an image is not available, try this:

mlcube run ... -Pdocker.build_strategy=always

github-actions[bot] commented 3 years ago

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

davidjurado commented 3 years ago

One thing I noticed: when running the command mlcube describe, the generated files are not updated with the new instructions from the mlcube/workspace/.mlcube.yaml file.

nv-rborkar commented 4 months ago

@davidjurado is the issue you observed resolved now?