ml-energy / zeus

Deep Learning Energy Measurement and Optimization
https://ml.energy/zeus
Apache License 2.0

Extending ZeusDataLoader to multi-GPU single-node #2

Closed. Rosie-m closed this pull request 1 year ago.

Rosie-m commented 1 year ago

Description

This pull request extends ZeusDataLoader from single-GPU training to single-node multi-GPU training, so that it supports distributed data-parallel training.
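For readers unfamiliar with data-parallel training, the sketch below illustrates the general idea of splitting one dataset across several GPU processes on a single node. It is not the code in this pull request and does not use any Zeus API. The function name `shard_indices` and its parameters are hypothetical, and the strided, wrap-around padding scheme is an assumption modeled loosely on how distributed samplers commonly behave (without shuffling).

```python
import math
from typing import List


def shard_indices(num_samples: int, world_size: int, rank: int) -> List[int]:
    """Return the sample indices that one process (rank) would load.

    Hypothetical helper for illustration only: each of the `world_size`
    processes gets an equally sized, strided slice of the dataset.
    """
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    if not 0 <= rank < world_size:
        raise ValueError("rank must satisfy 0 <= rank < world_size")

    # Every rank must see the same number of samples so that the
    # processes stay in lockstep during gradient synchronization.
    per_rank = math.ceil(num_samples / world_size)
    total = per_rank * world_size

    # Pad by wrapping around to the start of the dataset (assumed policy).
    padded = [i % num_samples for i in range(total)]

    # Strided slice: rank r takes items r, r + world_size, r + 2 * world_size, ...
    return padded[rank:total:world_size]


if __name__ == "__main__":
    # Ten samples split across four GPU processes.
    for r in range(4):
        print(r, shard_indices(10, 4, r))
```

With ten samples and four processes, each process receives three indices, and the last two processes reuse samples 0 and 1 as padding. Any per-GPU measurement therefore covers only that process's shard, which is one reason a single-GPU data loader needs changes before it can be used this way.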

Implementation

Important changes to the original framework

Launching methods for train.py

Closes #4

jaywonchung commented 1 year ago

Thank you for your work! I'll review sometime tomorrow.