mlcommons / ck

Collective Mind (CM) is a small, modular, cross-platform and decentralized workflow automation framework with a human-friendly interface and reusable automation recipes to make it easier to build, run, benchmark and optimize AI, ML and other applications and systems across diverse and continuously changing models, data, software and hardware
https://access.cKnowledge.org/challenges
Apache License 2.0
595 stars, 109 forks

Suggestions to improve CM-MLPerf inference after SCC'23 #1006

Closed gfursin closed 6 months ago

gfursin commented 9 months ago

While finalizing the CM-MLPerf BERT inference benchmark tutorial for SCC'23, here are a few missing things that we can do later if/when we have time and resources:

Save more info to the SCC'23 dashboard: https://wandb.ai/cmind/cm-mlperf-scc23-bert-offline/workspace

gfursin commented 9 months ago

More feedback from SCC'23 teams about CM-MLPerf automation:

gfursin commented 9 months ago

Another idea is to stop CM just before running the command and show how we prepared the command line and which cache entries are used (inference sources, model, dataset, loadgen, etc.). This is what the --debug flag already does, but we may want to mention it explicitly.
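To make the "stop just before running" idea concrete, here is a minimal Python sketch of a launcher with a debug switch. It is not CM's implementation: `PreparedRun`, `describe`, `launch`, the command line and all paths below are hypothetical names chosen only for illustration.

```python
import os
import shlex
import subprocess
from dataclasses import dataclass, field


@dataclass
class PreparedRun:
    """A prepared command plus the cached artifacts it depends on (hypothetical)."""
    argv: list
    env: dict = field(default_factory=dict)
    cache_entries: dict = field(default_factory=dict)  # artifact name -> local path


def describe(run):
    """Return a human-readable summary of what would be executed."""
    lines = ["Prepared command:", "  " + shlex.join(run.argv)]
    lines.append("Environment overrides:")
    for key in sorted(run.env):
        lines.append(f"  {key}={run.env[key]}")
    lines.append("Cache entries:")
    for name in sorted(run.cache_entries):
        lines.append(f"  {name}: {run.cache_entries[name]}")
    return "\n".join(lines)


def launch(run, debug=False):
    """Print the prepared run; execute it only when debug is False."""
    print(describe(run))
    if debug:
        # Stop here so the user can inspect the command and inputs.
        return None
    merged_env = {**os.environ, **run.env}
    return subprocess.run(run.argv, env=merged_env, check=False).returncode


# Hypothetical example values, not real CM cache paths or MLPerf flags.
example = PreparedRun(
    argv=["python3", "run.py", "--scenario", "Offline"],
    env={"EXAMPLE_MODEL_PATH": "/tmp/cache/model/model.onnx"},
    cache_entries={
        "inference-src": "/tmp/cache/inference",
        "model": "/tmp/cache/model",
        "dataset": "/tmp/cache/dataset",
        "loadgen": "/tmp/cache/loadgen",
    },
)

# In debug mode the summary is printed and nothing is executed.
launch(example, debug=True)
```

Keeping the summary in a separate `describe` function means the same text can be printed in debug mode or written to a log during a normal run.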

Also, we should simplify the tutorial and describe only the CM command that runs the full MLPerf inference benchmark out-of-the-box, with all the flags that can customize execution - we plan to prepare a higher-level CM-MLPerf launcher to do that.

gfursin commented 9 months ago

Yet another suggestion is to make it clear that CM is not a "reproducibility" tool but a workflow automation tool that attempts to adapt benchmarks/applications to continuously changing software and hardware. When combined with Docker, we can ensure that a benchmark runs (even though this doesn't guarantee reproducibility of performance numbers, since we may not be able to pin threads to specific cores, control NUMA, set the frequency of different cores, etc.). But CM users want to run MLPerf with new software/hardware even if it fails, because we can then collaboratively improve CM scripts and make them more portable and reproducible as a community effort - we should highlight this in the tutorial!

gfursin commented 6 months ago

We added support for most of the features listed here in CM v2+. I am closing this ticket, and we will open a new one for the few pending features if needed.