ml-energy / zeus

Deep Learning Energy Measurement and Optimization
https://ml.energy/zeus
Apache License 2.0

`BatchSizeOptimizer` server and client #19

Closed jaywonchung closed 2 months ago

jaywonchung commented 9 months ago

Batch size optimization should run as a server, because it has to optimize across multiple recurrences of the same training job and therefore needs state that outlives any single run. `BatchSizeOptimizer` (under `zeus.optimizer.batch_size`) will then act as the client of that server and will be integrated into the user's training script.
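
To make the proposed split concrete, here is a minimal in-process sketch of the idea. Everything in it is an assumption for illustration only: the class names `ToyBatchSizeServer` and `ToyBatchSizeClient`, their methods, the job ID, and the energy numbers are all hypothetical and are not Zeus's actual API. The "server" is just a long-lived object rather than a network service, and the selection rule is a simple explore-then-exploit heuristic standing in for whatever optimization strategy the real server would use.

```python
from dataclasses import dataclass, field


@dataclass
class JobState:
    """Per-job history the server keeps across recurrences (hypothetical)."""

    batch_sizes: list
    # Maps batch size -> list of observed energy costs from past recurrences.
    costs: dict = field(default_factory=dict)


class ToyBatchSizeServer:
    """Long-lived component: owns state that outlives any single training run."""

    def __init__(self):
        self._jobs = {}

    def register_job(self, job_id, batch_sizes):
        # Registering an already-known job keeps its existing history.
        self._jobs.setdefault(job_id, JobState(list(batch_sizes)))

    def predict(self, job_id):
        state = self._jobs[job_id]
        # Explore: try each candidate batch size once, in order.
        for bs in state.batch_sizes:
            if bs not in state.costs:
                return bs
        # Exploit: pick the batch size with the lowest mean observed cost.
        return min(
            state.batch_sizes,
            key=lambda bs: sum(state.costs[bs]) / len(state.costs[bs]),
        )

    def report(self, job_id, batch_size, energy_joules):
        self._jobs[job_id].costs.setdefault(batch_size, []).append(energy_joules)


class ToyBatchSizeClient:
    """Short-lived component: created inside each run of the training script."""

    def __init__(self, server, job_id, batch_sizes):
        self.server = server
        self.job_id = job_id
        server.register_job(job_id, batch_sizes)

    def get_batch_size(self):
        return self.server.predict(self.job_id)

    def report(self, batch_size, energy_joules):
        self.server.report(self.job_id, batch_size, energy_joules)


def run_recurrences(n):
    """Simulate n recurrences of one job against a single shared server."""
    server = ToyBatchSizeServer()
    # Made-up energy cost per batch size, standing in for real measurements.
    fake_energy = {32: 120.0, 64: 90.0, 128: 150.0}
    chosen = []
    for _ in range(n):
        # A fresh client per recurrence, as a new training script run would create.
        client = ToyBatchSizeClient(server, "job-a", [32, 64, 128])
        bs = client.get_batch_size()
        client.report(bs, fake_energy[bs])
        chosen.append(bs)
    return chosen


print(run_recurrences(5))  # [32, 64, 128, 64, 64]
```

The point of the sketch is the ownership boundary: the client is recreated on every recurrence and holds no history, while the server accumulates observations per job and uses them to pick the next batch size. In a real deployment the two would presumably talk over a network API rather than direct method calls.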