Closed by gfursin 2 years ago
We prepared a demo submission for MLPerf inference v2.1 to show that we can automate all steps of the MLPerf submission. We will continue community developments and plan the next release in September.
Following the successful validation of CK2 for MLPerf at the Student Cluster Competition, we are closing this ticket. The new roadmap is tracked here: https://github.com/mlcommons/ck/issues/536
Motivation
This project aims to decompose MLPerf inference benchmarking into a database of reusable, portable, customizable and deterministic scripts with a unified CLI, a common Python API and extensible JSON/YAML meta descriptions, using the 2nd generation of the CK framework.
The first goal is to simplify the development of this benchmark, make it easier to extend and run it across continuously changing ML tasks, models, data sets, engines, software and hardware, and automate all the manual steps of the submission process.
The second goal is to enable automatic and continuous design space exploration of ML systems across all ML tasks, models, data sets, engines, libraries and platforms based on MLPerf loadgen, and the selection of Pareto-optimal configurations based on user constraints (latency, throughput, accuracy, energy, model size, memory usage, device cost, etc.).
The third goal is to show researchers and engineers that it is possible to reuse portable ML scripts (to detect, download and install models, data sets, engines, libraries, tools) in their own research projects to avoid reinventing the wheel and use the solid MLPerf benchmarking methodology.
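The Pareto-optimal selection mentioned in the second goal can be sketched as follows. This is only a minimal illustration, not the project's implementation: the `Config` class, the two metrics and every number below are invented for the example and are not MLPerf measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical benchmark result for one ML system configuration."""
    name: str
    latency_ms: float  # lower is better
    accuracy: float    # higher is better


def dominates(a: Config, b: Config) -> bool:
    """True if `a` is no worse than `b` on both metrics and strictly better on one."""
    no_worse = a.latency_ms <= b.latency_ms and a.accuracy >= b.accuracy
    better = a.latency_ms < b.latency_ms or a.accuracy > b.accuracy
    return no_worse and better


def pareto_front(configs, max_latency_ms=None, min_accuracy=None):
    """Apply user constraints first, then keep only non-dominated configurations."""
    feasible = [
        c for c in configs
        if (max_latency_ms is None or c.latency_ms <= max_latency_ms)
        and (min_accuracy is None or c.accuracy >= min_accuracy)
    ]
    return [c for c in feasible if not any(dominates(o, c) for o in feasible)]


# Invented numbers, for illustration only.
configs = [
    Config("A", 10.0, 0.70),
    Config("B", 20.0, 0.76),
    Config("C", 25.0, 0.75),  # slower and less accurate than B
    Config("D", 40.0, 0.80),  # violates the latency constraint used below
]

front = pareto_front(configs, max_latency_ms=30.0)
print([c.name for c in front])
```

Filtering by constraints before computing the front keeps the comparison restricted to configurations the user could actually deploy. More metrics (energy, model size, device cost) would extend `dominates` in the same way.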
Technology
This project is based on the CK2 automation framework and on our practical experience reproducing 150+ ML and Systems papers and automating MLPerf inference submissions:
The CM framework (the 2nd generation of the CK framework, aka CK2) is used to organize ML projects as a database of reusable and portable components (tasks, models, datasets, engines, libraries, hardware descriptions): GitHub, motivation paper.
The CM automation called "script" wraps native scripts with a unified CLI, a Python API and JSON/YAML meta descriptions containing a unique ID, a list of tags, dependencies on other CM scripts and any other information required to make an ad-hoc script reusable, portable, customizable and deterministic: Python automation code
CM scripts automate the detection, download, installation and pre/post-processing of all ML artifacts required to run any ML task on any platform, natively or inside containers (models, data sets, engines, libraries, tools ...): GitHub with current scripts (under community development)
See CM tutorials to learn more about reusable CM scripts and CM database format for ML projects.
This is part of our CM (CK2) development roadmap for 2022.
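To make the idea of meta descriptions with tags and dependencies more concrete, here is a rough sketch of how a tag query could be resolved into an ordered list of scripts to run. It is a toy model, not the CM code: the registry entries, aliases, UIDs and the `find_scripts` and `resolve` helpers are all hypothetical, and it assumes that a script matches a query when its tags contain every requested tag.

```python
# Hypothetical meta descriptions, written as Python dicts instead of JSON/YAML.
REGISTRY = [
    {"alias": "get-python", "uid": "0000000000000001",
     "tags": ["get", "python"], "deps": []},
    {"alias": "get-onnxruntime", "uid": "0000000000000002",
     "tags": ["get", "onnxruntime", "python-lib"],
     "deps": [{"tags": "get,python"}]},
    {"alias": "get-ml-model-resnet50", "uid": "0000000000000003",
     "tags": ["get", "ml-model", "resnet50", "onnx"], "deps": []},
    {"alias": "app-image-classification-onnx-py", "uid": "0000000000000004",
     "tags": ["app", "image-classification", "onnx", "python"],
     "deps": [{"tags": "get,python"},
              {"tags": "get,onnxruntime"},
              {"tags": "get,ml-model,resnet50"}]},
]


def find_scripts(registry, query_tags):
    """Return every meta description whose tags include all requested tags."""
    wanted = set(query_tags.split(","))
    return [meta for meta in registry if wanted <= set(meta["tags"])]


def resolve(registry, query_tags, order=None):
    """Depth-first resolution: dependencies are listed before the script itself."""
    order = [] if order is None else order
    matches = find_scripts(registry, query_tags)
    if not matches:
        raise LookupError(f"no script found for tags: {query_tags}")
    meta = matches[0]  # simplification: take the first match
    for dep in meta.get("deps", []):
        resolve(registry, dep["tags"], order)
    if meta["alias"] not in order:
        order.append(meta["alias"])
    return order


plan = resolve(REGISTRY, "app,image-classification,onnx,python")
print(plan)
```

Selecting scripts by tags rather than by file path is what lets the same query work unchanged when a script is moved, renamed or replaced by an alternative implementation.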
People
Developers
Feedback
Tasks and timeline
Q3 2022
cm run script --tags=app,image-classification,onnx,python --quiet
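The same request can also be expressed through the Python API. The sketch below assumes the `cmind` package exposes `access()`, taking a dictionary with `action` and `automation` keys and returning a dictionary whose `return` field is 0 on success. The helpers `build_request`, `to_cli` and `run_with_cm` are hypothetical names introduced here, and nothing is executed unless `run_with_cm` is called explicitly.

```python
def build_request(tags, quiet=True):
    """Dictionary form of `cm run script --tags=... [--quiet]`."""
    return {
        "action": "run",
        "automation": "script",
        "tags": ",".join(tags),
        "quiet": quiet,
    }


def to_cli(request):
    """Render the request back into the equivalent command line."""
    parts = ["cm", request["action"], request["automation"],
             f"--tags={request['tags']}"]
    if request.get("quiet"):
        parts.append("--quiet")
    return " ".join(parts)


def run_with_cm(request):
    """Send the request to CM (requires `pip install cmind`); not called here."""
    import cmind  # imported lazily so the sketch loads without CM installed

    result = cmind.access(request)
    if result["return"] > 0:
        raise RuntimeError(result.get("error", "CM call failed"))
    return result


request = build_request(["app", "image-classification", "onnx", "python"])
print(to_cli(request))
```

Keeping the CLI and the Python call as two views of one dictionary means a command tested in the shell can be moved into a Python workflow without changing its meaning.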
Q4 2022 / Q1 2023