eth-easl / modyn

Modyn is a research platform for training ML models on growing datasets.
MIT License

feat: Implement `EvaluationExecutor` #481

Closed. robinholzi closed this 1 month ago

robinholzi commented 1 month ago

Motivation

We need to parallelize evaluations in the pipeline executor. We also want to implement post-pipeline evaluation.

Changes

To support this, we move the evaluation-related code into a dedicated class, `EvaluationExecutor`, which handles evaluations both during and after the pipeline run. It also supports restarting evaluations after the training pipeline has completed.
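For illustration only, here is a minimal sketch of the idea, not the actual Modyn implementation. The names `EvalRequest`, `EvaluationExecutorSketch`, `evaluate_fn`, and both `run_*` methods are hypothetical. It assumes each evaluation is an independent request that can run in a thread pool, and that a restart only needs to skip requests whose results are already recorded.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class EvalRequest:
    """Hypothetical description of one evaluation: a model on a data interval."""
    model_id: int
    interval_start: int
    interval_end: int


@dataclass
class EvaluationExecutorSketch:
    """Hypothetical stand-in for a dedicated evaluation class."""
    evaluate_fn: Callable[[EvalRequest], float]  # assumed: runs one evaluation
    max_workers: int = 4
    completed: Dict[EvalRequest, float] = field(default_factory=dict)

    def _run(self, requests: List[EvalRequest]) -> Dict[EvalRequest, float]:
        # Run the given requests in parallel and record their results.
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            results = list(pool.map(self.evaluate_fn, requests))
        finished = dict(zip(requests, results))
        self.completed.update(finished)
        return finished

    def run_pipeline_evaluations(
        self, requests: List[EvalRequest]
    ) -> Dict[EvalRequest, float]:
        # Evaluations triggered while the training pipeline is running.
        return self._run(requests)

    def run_post_pipeline_evaluations(
        self, requests: List[EvalRequest]
    ) -> Dict[EvalRequest, float]:
        # Evaluations after the pipeline; already-completed requests are
        # skipped, so this call can be repeated safely after a restart.
        pending = [r for r in requests if r not in self.completed]
        self._run(pending)
        return {r: self.completed[r] for r in requests}
```

In this sketch, calling `run_post_pipeline_evaluations` again with the same requests triggers no additional calls to `evaluate_fn`, which is one simple way to make a restart cheap.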

MaxiBoether commented 1 month ago

I know this is not ready for review yet, but just as a general note: I think stuff like failure recovery and parallelization should be implemented in the evaluator component, not the supervisor. So at some point we'll probably delete the thread pool from the supervisor and just send a couple of requests, and the evaluator takes care of handling them correctly.

robinholzi commented 1 month ago

partially superseded by https://github.com/eth-easl/modyn/pull/490