mlcommons / training

Reference implementations of MLPerf™ training benchmarks
https://mlcommons.org/en/groups/training
Apache License 2.0

MLCube implementation for Stable Diffusion #696

Open davidjurado opened 8 months ago

davidjurado commented 8 months ago

MLCube for Stable Diffusion

See the MLCube™ GitHub repository and the MLCube™ wiki.

Project setup

Docker must be installed before you begin.

# Create Python environment and install MLCube Docker runner 
virtualenv -p python3 ./env && source ./env/bin/activate && pip install pip==24.0
pip install mlcube-docker
# Fetch the implementation from GitHub
git clone https://github.com/mlcommons/training && cd ./training
git fetch origin pull/696/head:feature/mlcube_sd && git checkout feature/mlcube_sd
cd ./stable_diffusion/mlcube
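Because the setup commands above depend on Docker and virtualenv being available, a small preflight check can catch a missing tool early. The following is a minimal sketch, not part of this PR; `require_tool` is a hypothetical helper name, and it only checks that a command is on `PATH`, not that the Docker daemon is running.

```shell
# Preflight check: report whether a required command is on PATH.
# require_tool is a hypothetical helper, not part of MLCube or this PR.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}
```

For example, `require_tool docker && require_tool virtualenv` returns a non-zero status if either tool is missing.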

Inside the mlcube directory, run the following command to list the implemented tasks.

mlcube describe

MLCube tasks

Download dataset.

mlcube run --task=download_data

Download models.

mlcube run --task=download_models

Train.

mlcube run --task=train

Demo steps (a video walkthrough was embedded here in the original post):

Download demo dataset.

mlcube run --task=download_demo

Download models.

mlcube run --task=download_models

Train demo.

mlcube run --task=demo

Execute the complete pipeline

You can execute the complete pipeline with a single command.

mlcube run --task=download_data,download_models,train

Tested on an NVIDIA A100 (40 GB). The demo variant of the pipeline:

mlcube run --task=download_demo,download_models,demo
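When debugging, it can help to run the tasks one at a time so that it is obvious which step failed. The sketch below is one possible wrapper, not part of this PR: `run_tasks` and the `MLCUBE_BIN` variable are hypothetical names, and only the `mlcube run --task=<name>` invocation is taken from the commands above.

```shell
# Run MLCube tasks one at a time and stop at the first failure.
# run_tasks and MLCUBE_BIN are hypothetical names for this sketch;
# only `mlcube run --task=<name>` comes from the instructions above.
MLCUBE_BIN="${MLCUBE_BIN:-mlcube}"

run_tasks() {
    for task in "$@"; do
        echo ">>> running task: $task"
        "$MLCUBE_BIN" run --task="$task" || {
            echo "task failed: $task" >&2
            return 1
        }
    done
}
```

For example, `run_tasks download_demo download_models demo` runs the three demo tasks in order and stops as soon as one returns a non-zero status.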

Note: To rebuild the Docker image, pass the flag -Pdocker.build_strategy=always to the mlcube run command.
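As an illustration of where the flag goes, the sketch below appends it to a single-task run. `rebuild_and_run` and `MLCUBE_BIN` are hypothetical names introduced here; the flag itself is the one quoted in the note above.

```shell
# Run one task and force the Docker image to be rebuilt first.
# rebuild_and_run and MLCUBE_BIN are hypothetical names for this sketch.
MLCUBE_BIN="${MLCUBE_BIN:-mlcube}"

rebuild_and_run() {
    # $1: task name, for example "demo"
    "$MLCUBE_BIN" run --task="$1" -Pdocker.build_strategy=always
}
```

Calling `rebuild_and_run demo` is then equivalent to typing `mlcube run --task=demo -Pdocker.build_strategy=always`.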

github-actions[bot] commented 8 months ago

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅