A collection of portable, reusable and cross-platform automation recipes (CM scripts) with a human-friendly interface and minimal dependencies to make it easier to build, run, benchmark and optimize AI, ML and other applications and systems across diverse and continuously changing models, data sets, software and hardware (cloud/edge)
We were asked to help students run the MLPerf inference benchmark at the Student Cluster Competition'24 and to automate their submission and grading via the MLCommons CM automation framework.
The current plan is to use the MLPerf inference Stable Diffusion benchmark with Stability AI's Stable Diffusion XL model (2.6 billion parameters) and the COCO data set. This popular model generates compelling images from text-based prompts.
We must check the following:
- [ ] Check current CM workflows to run the reference MLPerf SD benchmark
- [ ] Check CM workflows to run the optimized MLPerf SD benchmark v4.0 implementations:
  - [ ] Intel
  - [ ] Nvidia
- [ ] Check if support for AMD GPUs can be provided
- [ ] Check how to support multi-node inference
- [ ] Prepare a tutorial about MLPerf, LoadGen, this benchmark and CM
- [ ] Check the MLCommons Croissant format for the dataset?
- [ ] Automate submission and grading:
  - [ ] Agree on how to report accuracy
  - [ ] Possibly train a smaller model to analyze the produced images
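As a starting point for the first checklist item, the reference benchmark is expected to be launched through the CM command line. The sketch below only assembles and prints the command for review rather than executing it; the script tags and flags (`run-mlperf,inference`, `--model=sdxl`, etc.) are assumptions based on the MLCommons CM/cm4mlops documentation and should be verified against the current docs before use.

```shell
# Hedged sketch (flags are assumptions, verify against cm4mlops docs):
# assemble the CM command for the reference MLPerf inference SDXL run.

MODEL=sdxl            # assumed CM model tag for Stable Diffusion XL
SCENARIO=Offline      # MLPerf inference scenario

# Build the command string; --device/--backend values are placeholders
# that students would adjust for their cluster hardware.
CMD="cm run script --tags=run-mlperf,inference --model=$MODEL \
--implementation=reference --device=cuda --backend=pytorch \
--scenario=$SCENARIO"

# Print instead of executing, so the command can be inspected first:
echo "$CMD"
```

Before running the printed command, CM itself would need to be installed (`pip install cmind`) and the automation recipes pulled with `cm pull repo` (the exact repository name should be taken from the MLCommons documentation).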