ml-energy / zeus

Deep Learning Energy Measurement and Optimization
https://ml.energy/zeus
Apache License 2.0

[![Slack workspace](https://badgen.net/badge/icon/Join%20workspace/b31b1b?icon=slack&label=Slack)](https://join.slack.com/t/zeus-ml/shared_invite/zt-2j5o12jqp-3LtNjgF_uBDTdNcaxWgpdw) [![Docker Hub](https://badgen.net/docker/pulls/symbioticlab/zeus?icon=docker&label=Docker%20pulls)](https://hub.docker.com/r/symbioticlab/zeus) [![Homepage](https://custom-icon-badges.demolab.com/badge/Homepage-ml.energy-23d175.svg?logo=home&logoColor=white&logoSource=feather)](https://ml.energy/zeus) [![Apache-2.0 License](https://custom-icon-badges.herokuapp.com/github/license/ml-energy/zeus?logo=law)](/LICENSE)

Zeus is a library for (1) measuring the energy consumption of Deep Learning workloads and (2) optimizing their energy consumption.

Zeus is part of The ML.ENERGY Initiative.

Repository Organization

zeus/
├── zeus/             # ⚡ Zeus Python package
│  ├── monitor/       #    - Energy and power measurement (programmatic & CLI)
│  ├── optimizer/     #    - Collection of time and energy optimizers
│  ├── device/        #    - Abstraction layer over CPU and GPU devices
│  ├── utils/         #    - Utility functions and classes
│  ├── _legacy/       #    - Legacy code to keep our research papers reproducible
│  ├── show_env.py    #    - Installation & device detection verification script
│  └── callback.py    #    - Base class for callbacks during training
│
├── zeusd             # 🌩️ Zeus daemon
│
├── docker/           # 🐳 Dockerfiles and Docker Compose files
│
└── examples/         # 🛠️ Zeus usage examples

Getting Started

Please refer to our Getting Started page.

Docker image

We provide a Docker image fully equipped with all dependencies and environments. Refer to our Docker Hub repository and Dockerfile.

Examples

We provide working examples for integrating and running Zeus in the examples/ directory.
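
As a rough illustration of what energy measurement can look like, here is a minimal sketch that wraps a workload in a named measurement window. It assumes a monitor object exposing `begin_window(name)` and `end_window(name)`, where the latter returns an object with `time` (seconds) and `total_energy` (joules) fields, which is how we understand `zeus.monitor.ZeusMonitor` to behave. `FakeMonitor`, `FakeMeasurement`, and `measure` are hypothetical names introduced only so the sketch runs without a GPU or Zeus installed; they are not part of Zeus.

```python
import time
from dataclasses import dataclass


@dataclass
class FakeMeasurement:
    """Hypothetical stand-in for the result of a measurement window."""

    time: float          # elapsed wall-clock time in seconds
    total_energy: float  # energy consumed in joules


class FakeMonitor:
    """Hypothetical stand-in monitor (not part of Zeus).

    It pretends the device draws a constant power, so that
    energy = power * elapsed time.
    """

    def __init__(self, watts=100.0):
        self.watts = watts
        self._starts = {}  # window name -> start timestamp

    def begin_window(self, name):
        self._starts[name] = time.perf_counter()

    def end_window(self, name):
        elapsed = time.perf_counter() - self._starts.pop(name)
        return FakeMeasurement(time=elapsed, total_energy=self.watts * elapsed)


def measure(monitor, name, workload):
    """Run `workload()` inside a named measurement window.

    Returns (workload result, elapsed seconds, energy in joules).
    """
    monitor.begin_window(name)
    result = workload()
    measurement = monitor.end_window(name)
    return result, measurement.time, measurement.total_energy


if __name__ == "__main__":
    # Assumption: with Zeus installed on a GPU machine, the fake monitor
    # could be swapped for something like:
    #   from zeus.monitor import ZeusMonitor
    #   monitor = ZeusMonitor(gpu_indices=[0])
    monitor = FakeMonitor(watts=100.0)
    result, seconds, joules = measure(
        monitor, "toy", lambda: sum(range(1_000_000))
    )
    print(f"result={result} time={seconds:.4f}s energy={joules:.2f}J")
```

Because `measure` depends only on the two window methods, the same wrapper should work with a real monitor or a test double. The examples/ directory remains the authoritative reference for actual integrations.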

Research

Zeus is rooted in multiple research papers. More research is ongoing, and Zeus will continue to expand and improve.

  1. Zeus (2023): Paper | Blog | Slides
  2. Chase (2023): Paper
  3. Perseus (2023): Paper | Blog

If you find Zeus relevant to your research, please consider citing:

@inproceedings{zeus-nsdi23,
    title     = {Zeus: Understanding and Optimizing {GPU} Energy Consumption of {DNN} Training},
    author    = {Jie You and Jae-Won Chung and Mosharaf Chowdhury},
    booktitle = {USENIX NSDI},
    year      = {2023}
}

Other Resources

  1. Energy-Efficient Deep Learning with PyTorch and Zeus (PyTorch Conference 2023): Recording | Slides

Contact

Jae-Won Chung (jwnchung@umich.edu)