mle-infrastructure / mle-toolbox

Lightweight Tool to Manage Distributed ML Experiments 🛠
https://mle-infrastructure.github.io/mle_toolbox/toolbox/
MIT License

Monthly clean up of GCS experiments #33

Closed RobertTLange closed 2 years ago

RobertTLange commented 3 years ago

I want a helper command that pulls all experiments currently stored in GCS to a hard drive and resets the storage buffer. This helps reduce or avoid data storage costs (up to 100GB). Something like sync-gcs-storage that goes through the following steps:

  1. Load a new local database of all experiments.
  2. Pull all experiments from GCS and unzip them.
  3. Delete all experiments from GCS.
  4. Copy over the local database and give it a timestamp.
  5. Delete the local database and reset it.
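
A rough sketch of how these steps could fit together, not the toolbox's actual implementation. To keep it runnable without credentials, a plain directory of `<experiment_id>.zip` files stands in for the GCS bucket and a JSON file stands in for the protocol database. Both are assumptions for illustration, as are the function name `sync_gcs_storage` and all of its parameters.

```python
import json
import shutil
import zipfile
from datetime import datetime
from pathlib import Path


def sync_gcs_storage(bucket_dir, archive_dir, db_path):
    """Pull all zipped experiments, clear the bucket, back up and reset the DB.

    bucket_dir  -- hypothetical stand-in for the GCS bucket (directory of zips)
    archive_dir -- hard drive location that receives the experiments
    db_path     -- hypothetical JSON file standing in for the protocol DB
    """
    bucket_dir, archive_dir, db_path = map(Path, (bucket_dir, archive_dir, db_path))
    archive_dir.mkdir(parents=True, exist_ok=True)

    # 1. Load the local database of all experiments.
    db = json.loads(db_path.read_text())

    # 2. Pull every experiment from the bucket and unzip it.
    pulled = []
    for zip_path in sorted(bucket_dir.glob("*.zip")):
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(archive_dir / zip_path.stem)
        pulled.append(zip_path)

    # 3. Delete from the bucket only after every archive was extracted.
    for zip_path in pulled:
        zip_path.unlink()

    # 4. Copy the database next to the experiments under a timestamped name.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    backup = archive_dir / f"{db_path.stem}-{stamp}{db_path.suffix}"
    shutil.copy2(db_path, backup)

    # 5. Reset the local database to an empty one.
    db_path.write_text(json.dumps({}))

    return {"backup": backup, "pulled": [p.stem for p in pulled], "logged": len(db)}
```

Deleting only after all extractions have succeeded means a failed unzip leaves the remote copies untouched. A real version would swap the directory operations for calls to a GCS client.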

RobertTLange commented 3 years ago

Should this happen automatically? We could keep a counter or last sync date variable in the Protocol DB and automatically pull + clean up once a sync is due.
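
One way the "is a sync due?" check might look, as a minimal sketch. The names `sync_due`, `last_sync`, `n_since_sync`, and `max_experiments`, and the 30-day default taken from the "monthly" in the title, are all assumptions here, not existing Protocol DB fields.

```python
from datetime import datetime, timedelta
from typing import Optional


def sync_due(
    last_sync: Optional[datetime],
    now: datetime,
    n_since_sync: int = 0,
    every: timedelta = timedelta(days=30),
    max_experiments: Optional[int] = None,
) -> bool:
    """Return True if the automatic pull + clean up should run."""
    # Never synced before: run once to establish a baseline.
    if last_sync is None:
        return True
    # Counter-based trigger (optional): too many experiments since last sync.
    if max_experiments is not None and n_since_sync >= max_experiments:
        return True
    # Date-based trigger: the sync interval has elapsed.
    return now - last_sync >= every
```

The toolbox could call this at startup with the stored values and, when it returns True, run the sync and write the new sync date and a zeroed counter back to the database.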