mlos_bench_service - GitHub Issues

bpkroth commented 2 months ago

- (Storage) APIs to
  - [ ] create new Experiments
    - how to get/update configs? git URI?
  - [ ] set Experiment to runnable
    - daemon to start mlos_bench process as child
  - [ ] set Experiment to stopped
    - stop any existing runner
    - see also #687
- [ ] new script (mlos_benchd) to manage those actions
  - [ ] it would run in a tight loop on the "control VM"
  - [ ] as Experiments become runnable in the queue, it would create an mlos_bench process for them and monitor the child process, changing that Experiment's state in the database as necessary according to the child process's exit code
- [ ] notifications on errors and/or a monitoring dashboard on Experiment status, interacting mostly with the Storage APIs
  - see also #523
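
A rough sketch of what the mlos_benchd loop described above might look like, to make the state transitions concrete. Everything here is an assumption for illustration: `InMemoryStorage`, its methods, the state names other than "runnable", and the per-Experiment `command` are hypothetical stand-ins, not the existing mlos_bench Storage API or CLI. It also waits on each child in turn, whereas a real daemon would probably poll several children without blocking.

```python
import subprocess
import sys
import time
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical Experiment states; the issue itself only names "runnable" and "stopped".
CREATED, RUNNABLE, RUNNING, SUCCEEDED, FAILED = (
    "created", "runnable", "running", "succeeded", "failed",
)


@dataclass
class InMemoryStorage:
    """Hypothetical stand-in for the proposed Storage APIs (not the real mlos_bench class)."""

    status: Dict[str, str] = field(default_factory=dict)
    commands: Dict[str, List[str]] = field(default_factory=dict)

    def create_experiment(self, experiment_id: str, command: List[str]) -> None:
        # Register a new Experiment together with the command that would run it.
        self.commands[experiment_id] = command
        self.status[experiment_id] = CREATED

    def set_status(self, experiment_id: str, status: str) -> None:
        self.status[experiment_id] = status

    def runnable_experiments(self) -> List[str]:
        # Snapshot of the queue, so callers can update statuses while iterating.
        return [exp for exp, state in self.status.items() if state == RUNNABLE]


def run_once(storage: InMemoryStorage) -> None:
    """One pass of the daemon: start each runnable Experiment and record its outcome."""
    for experiment_id in storage.runnable_experiments():
        storage.set_status(experiment_id, RUNNING)
        child = subprocess.Popen(storage.commands[experiment_id])
        exit_code = child.wait()  # Blocks until the child exits (simplification).
        storage.set_status(experiment_id, SUCCEEDED if exit_code == 0 else FAILED)


def main(storage: InMemoryStorage, poll_seconds: float = 5.0) -> None:
    """Loop on the "control VM", picking up Experiments as they become runnable."""
    while True:
        run_once(storage)
        time.sleep(poll_seconds)
```

With this shape, the "set Experiment to runnable" API reduces to `storage.set_status(experiment_id, RUNNABLE)`, and the daemon picks the Experiment up on its next pass. Stopping an existing runner is not covered here; it would need the daemon to keep the child handles around so it can terminate them.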

bpkroth commented 2 months ago

@eujing

bpkroth commented 2 months ago

May want to split some of these tasks out to separate issues later on

microsoft / MLOS