mle-infrastructure / mle-toolbox

Lightweight Tool to Manage Distributed ML Experiments 🛠
https://mle-infrastructure.github.io/mle_toolbox/toolbox/
MIT License
3 stars, 1 fork

GCP VM experiment launch #7

Closed. RobertTLange closed this issue 3 years ago

RobertTLange commented 3 years ago

Add an option to run an experiment on a GCP VM. Provide a default image and default resources for the launch. The steps should include:

  1. Upload the code directory to a GCS bucket.
  2. Generate a startup script from the experiment config.
  3. Create the VM using the gcloud CLI.
  4. Monitor whether the jobs are still running.
  5. Collect all results in GCS and clean up (delete the VM).
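
A rough sketch of how the five steps could map onto `gsutil`/`gcloud` calls, written as Python helpers that only build the commands (nothing is executed). This is not the toolbox's implementation. All function names, the bucket layout (`gs://<bucket>/<job_id>/code` and `.../results`), the `results` output directory, and the default image and machine type are assumptions made for illustration.

```python
from typing import List

# Hypothetical defaults; swap in whatever image/resources the toolbox settles on.
DEFAULT_MACHINE_TYPE = "n1-standard-4"
DEFAULT_IMAGE_FAMILY = "debian-12"
DEFAULT_IMAGE_PROJECT = "debian-cloud"

# VM states in which the experiment may still be running.
ACTIVE_STATES = {"PROVISIONING", "STAGING", "RUNNING"}


def upload_cmd(code_dir: str, bucket: str, job_id: str) -> List[str]:
    """Step 1: copy the local code directory to the GCS bucket."""
    return ["gsutil", "-m", "cp", "-r", code_dir, f"gs://{bucket}/{job_id}/code"]


def startup_script(bucket: str, job_id: str, run_cmd: str) -> str:
    """Step 2: shell script the VM runs on boot (fetch code, run, upload results)."""
    return "\n".join([
        "#!/bin/bash",
        f"gsutil -m cp -r gs://{bucket}/{job_id}/code /tmp/",
        "cd /tmp/code",
        run_cmd,  # assumed to write its outputs to ./results
        f"gsutil -m cp -r results gs://{bucket}/{job_id}/results",
        "",
    ])


def create_vm_cmd(vm_name: str, zone: str, script_path: str) -> List[str]:
    """Step 3: create the VM and attach the startup script as metadata."""
    return [
        "gcloud", "compute", "instances", "create", vm_name,
        "--zone", zone,
        "--machine-type", DEFAULT_MACHINE_TYPE,
        "--image-family", DEFAULT_IMAGE_FAMILY,
        "--image-project", DEFAULT_IMAGE_PROJECT,
        "--metadata-from-file", f"startup-script={script_path}",
    ]


def status_cmd(vm_name: str, zone: str) -> List[str]:
    """Step 4: query the VM status (prints e.g. RUNNING or TERMINATED)."""
    return [
        "gcloud", "compute", "instances", "describe", vm_name,
        "--zone", zone, "--format", "value(status)",
    ]


def is_active(status: str) -> bool:
    """Step 4: interpret the status string returned by status_cmd."""
    return status.strip().upper() in ACTIVE_STATES


def collect_cmd(bucket: str, job_id: str, local_dir: str) -> List[str]:
    """Step 5a: download the results from the bucket."""
    return ["gsutil", "-m", "cp", "-r", f"gs://{bucket}/{job_id}/results", local_dir]


def delete_vm_cmd(vm_name: str, zone: str) -> List[str]:
    """Step 5b: delete the VM without an interactive prompt."""
    return ["gcloud", "compute", "instances", "delete", vm_name, "--zone", zone, "--quiet"]
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. Note that VM status alone does not tell whether the job finished, since the instance keeps running after the startup script exits; the script could end with a shutdown or write a marker file to the bucket for the monitor to poll.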

Note: Reuse the TPU VM notes/snippets drafted earlier.