[ ] new script (mlos_benchd) to manage those actions
[ ] it would run in a tight loop on the "control VM"
[ ] as Experiments become runnable in the queue, it would create an mlos_bench process for each one, monitor that child process, and update the Experiment's state in the database according to the child process's exit code
[ ] notifications on errors and/or monitoring dashboard on Experiment status, interacting mostly with the Storage APIs
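
A minimal sketch of the loop described above, to make the launch/monitor/update cycle concrete. Everything here is an assumption for illustration: the `Experiment` dataclass and the in-memory `queue` list stand in for whatever the Storage APIs actually expose, the state names (`RUNNABLE`, `RUNNING`, `SUCCEEDED`, `FAILED`) are hypothetical, and the child command is a placeholder rather than a real mlos_bench invocation.

```python
import subprocess
import sys
import time
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Experiment:
    """Hypothetical stand-in for an Experiment record in the Storage backend."""

    experiment_id: str
    command: List[str]  # placeholder; in practice the mlos_bench command line
    status: str = "RUNNABLE"  # assumed states: RUNNABLE, RUNNING, SUCCEEDED, FAILED
    exit_code: Optional[int] = None


def run_once(queue: List[Experiment], running: Dict[str, subprocess.Popen]) -> None:
    """One polling pass: launch runnable Experiments, then reap finished children."""
    for exp in queue:
        if exp.status == "RUNNABLE":
            running[exp.experiment_id] = subprocess.Popen(exp.command)
            exp.status = "RUNNING"

    for exp in queue:
        proc = running.get(exp.experiment_id)
        if proc is None:
            continue
        code = proc.poll()  # None while the child is still running
        if code is None:
            continue
        # Map the child's exit code onto the (assumed) Experiment states.
        exp.exit_code = code
        exp.status = "SUCCEEDED" if code == 0 else "FAILED"
        del running[exp.experiment_id]


def serve(
    queue: List[Experiment],
    poll_interval: float = 1.0,
    stop_when_idle: bool = False,
) -> None:
    """Poll the queue forever, or until idle when stop_when_idle is set."""
    running: Dict[str, subprocess.Popen] = {}
    while True:
        run_once(queue, running)
        idle = not running and not any(e.status == "RUNNABLE" for e in queue)
        if stop_when_idle and idle:
            return
        time.sleep(poll_interval)  # short sleep so the loop does not spin the CPU


if __name__ == "__main__":
    # Demo with trivial child processes instead of real mlos_bench runs.
    demo = [
        Experiment("exp-ok", [sys.executable, "-c", "raise SystemExit(0)"]),
        Experiment("exp-bad", [sys.executable, "-c", "raise SystemExit(3)"]),
    ]
    serve(demo, poll_interval=0.05, stop_when_idle=True)
    for exp in demo:
        print(exp.experiment_id, exp.status, exp.exit_code)
```

In a real daemon the two loops in `run_once` would read from and write to the database through the Storage APIs rather than mutate an in-memory list, and the `FAILED` branch is a natural place to hook in the error notifications mentioned above.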