odahu / odahu-flow

Apache License 2.0

As a DS I would like to auto schedule regular training/packaging/deployment for models for which I have already manually run training, packaging & deployment #642

Open alinaignatiuk opened 2 years ago

alinaignatiuk commented 2 years ago

Business context: Once the data scientist has determined the best model and trained, packaged and deployed it on the ODAHU k8s cluster, the model is used by external services. Over time, however, new data arrives and the model should be retrained periodically on the updated data sets. In most cases nothing changes except that new cases appear in the data sets. The model should therefore be retrained, repackaged and redeployed so that it provides more accurate predictions based on the latest data.

Use case: Auto scheduling for training/packaging/deployment

Design: ODAHU UI (feature available over ODAHU UI)

Acceptance criteria:

  1. User should be able to activate the auto scheduler for models that have already been manually trained, packaged and deployed in ODAHU
  2. User should specify:
    • trained model ID,
    • packaged model ID,
    • deployed model ID and
    • auto scheduler parameters
  3. Auto scheduler parameters:
    • on/off
    • start date (mandatory)
    • end date (optional; if not set, the schedule runs indefinitely)
    • start time (mandatory)
    • frequency (daily, weekly, bi-weekly, monthly)
    • day(s) of the week
    • time zone (informational field; times are entered in the local time zone and converted to UTC)
  4. User should be able to pick the trained model ID, packaged model ID and deployed model ID from lists
  5. User should be able to see the list of auto scheduled models with their current status: On/Off/Running(?)
  6. For each auto scheduled training, packaging and deployment run, the system should save information about the run into the registry (discuss)
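
To make the scheduler parameters in AC.3 concrete, here is a minimal Python sketch of how they could be modelled and how the next run could be computed in UTC. Everything in it is an assumption for discussion, not ODAHU code: the names `SchedulerParams` and `next_run_utc` are hypothetical, the time zone is passed as a `tzinfo` object (for example `zoneinfo.ZoneInfo("Europe/Berlin")`), and monthly frequency and day(s)-of-the-week selection are left out to keep it short.

```python
from dataclasses import dataclass
from datetime import date, datetime, time, timedelta, timezone, tzinfo
from typing import Optional

# Hypothetical mapping; "monthly" is deliberately omitted from this sketch.
INTERVAL_DAYS = {"daily": 1, "weekly": 7, "bi-weekly": 14}


@dataclass
class SchedulerParams:
    """Illustrative container for the parameters listed in AC.3."""
    enabled: bool                     # on/off
    start_date: date                  # mandatory
    start_time: time                  # mandatory, in the user's local time zone
    tz: tzinfo                        # user's time zone, used for UTC conversion
    frequency: str = "daily"          # one of INTERVAL_DAYS
    end_date: Optional[date] = None   # None means "run indefinitely"


def next_run_utc(p: SchedulerParams, now_utc: datetime) -> Optional[datetime]:
    """Return the next scheduled run as a UTC datetime, or None if there is none."""
    if not p.enabled:
        return None
    step = INTERVAL_DAYS[p.frequency]
    today_local = now_utc.astimezone(p.tz).date()
    # Jump to the most recent scheduled date on or before today, then walk forward.
    elapsed = max(0, (today_local - p.start_date).days)
    run_date = p.start_date + timedelta(days=(elapsed // step) * step)
    while True:
        if p.end_date is not None and run_date > p.end_date:
            return None
        run_local = datetime.combine(run_date, p.start_time, tzinfo=p.tz)
        run_utc = run_local.astimezone(timezone.utc)
        if run_utc >= now_utc:
            return run_utc
        run_date += timedelta(days=step)


if __name__ == "__main__":
    params = SchedulerParams(
        enabled=True,
        start_date=date(2024, 1, 1),
        start_time=time(9, 0),
        tz=timezone(timedelta(hours=1)),  # fixed UTC+1 offset, for illustration only
        frequency="weekly",
    )
    print(next_run_utc(params, datetime.now(timezone.utc)))
```

Storing the schedule in local terms and converting to UTC only when a run time is computed keeps the "start time" stable for the user; with a real `ZoneInfo` zone this also follows daylight saving changes.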

Dependency:

  1. Within the training, packaging and deployment model lists, it would be good to be able to recognize the models that have an active auto scheduler

Decomposition: We need to provide a rough estimate and the order of the tasks

ODAHU UI:

  1. Auto scheduled models list
  2. Scheduler parameters form
  3. Update training, packaging & deployment lists respectively for AC.6

ODAHU back-end:

  1. Create DB
  2. Create API
  3. Create separate entity for Scheduler
  4. Run the model training as per the existing parameters and data for this particular model ID
  5. Update training artifact in the packaging input parameters
  6. Update packaging artifact in the deployment input parameters
  7. Create logic
  8. Check whether we have functions that produce the lists of trained, packaged and deployed model IDs (3 lists)
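
Back-end steps 4 to 6 describe a chain: retrain, feed the new training artifact into the packaging input, then feed the new packaging result into the deployment input. A rough Python sketch of one scheduled cycle is below. All names (`PackagingSpec`, `DeploymentSpec`, `run_scheduled_cycle`) and the `train`/`package`/`deploy` callables are hypothetical placeholders and do not reflect the actual ODAHU API or entity schemas.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple


@dataclass(frozen=True)
class PackagingSpec:
    """Hypothetical packaging input: which training artifact to package."""
    packaging_id: str
    artifact_name: str


@dataclass(frozen=True)
class DeploymentSpec:
    """Hypothetical deployment input: which packaged image to serve."""
    deployment_id: str
    image: str


def run_scheduled_cycle(
    training_id: str,
    packaging: PackagingSpec,
    deployment: DeploymentSpec,
    train: Callable[[str], str],               # returns the new training artifact name
    package: Callable[[PackagingSpec], str],   # returns the new packaged image
    deploy: Callable[[DeploymentSpec], None],
) -> Tuple[PackagingSpec, DeploymentSpec]:
    """Run one retrain -> repackage -> redeploy cycle and return the updated inputs."""
    # Step 4: rerun training with the existing parameters for this training ID.
    artifact = train(training_id)
    # Step 5: point the packaging input at the freshly trained artifact.
    new_packaging = replace(packaging, artifact_name=artifact)
    image = package(new_packaging)
    # Step 6: point the deployment input at the freshly packaged image.
    new_deployment = replace(deployment, image=image)
    deploy(new_deployment)
    return new_packaging, new_deployment


if __name__ == "__main__":
    # Stub callables stand in for the real training, packaging and deployment calls.
    pk, dp = run_scheduled_cycle(
        "train-1",
        PackagingSpec("pack-1", "old-artifact.zip"),
        DeploymentSpec("dep-1", "registry/old:1"),
        train=lambda tid: f"{tid}-artifact.zip",
        package=lambda spec: f"registry/{spec.artifact_name}",
        deploy=lambda spec: None,
    )
    print(pk, dp)
```

Passing the three operations in as callables keeps the chaining logic testable without a cluster; only the IDs stay fixed between runs, while the artifact and image references are replaced on every cycle.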