canonical / data-science-stack

Stack with machine learning tools needed for local development.
Apache License 2.0

Implement initialise CLI command #12

Open misohu opened 8 months ago

misohu commented 8 months ago

Why it needs to get done

DSS needs a command to connect Juju (from within the snap) to MicroK8s running on the local machine. For this task we may assume that both executables are available on the local machine.

What needs to get done

Implement a CLI command, dss initialise, which connects Juju to MicroK8s based on the config specified in the KUBECONFIG environment variable or in the --kubeconfig flag. Initialisation will consist of:

bundle: kubernetes
name: dss
applications:
  admission-webhook:
    charm: admission-webhook
    channel: 1.8/stable
    trust: true
    scale: 1
    _github_repo_name: admission-webhook-operator
    _github_repo_branch: track/1.8
  mlflow-minio:
    charm: minio
    channel: ckf-1.7/stable
    scale: 1
    trust: true
    _github_repo_name: minio-operator
  mlflow-mysql:
    charm: mysql-k8s
    channel: 8.0/stable
    scale: 1
    trust: true
    _github_repo_name: mysql-k8s-operator
  mlflow-server:
    charm: mlflow-server
    channel: 2.1/stable
    scale: 1
    trust: true
    _github_repo_name: mlflow-operator
  jupyter-controller:
    charm: jupyter-controller
    channel: latest/edge
    scale: 1
    trust: true
    _github_repo_name: notebook-operators
    options:
      use-istio: false
relations:
  - [mlflow-server, mlflow-minio]
  - [mlflow-server, mlflow-mysql]

Example bash script from demo:

juju bootstrap my-k8s uk8s-controller
juju add-model kubeflow

# Deploy charms
juju deploy dss --trust

juju wait-for application mlflow-server --query='name=="mlflow-server" && (status=="active" || status=="idle")' --timeout=15m0s
juju wait-for application mlflow-minio --query='name=="mlflow-minio" && (status=="active" || status=="idle")' --timeout=15m0s
juju wait-for application jupyter-controller --query='name=="jupyter-controller" && (status=="active" || status=="idle")' --timeout=15m0s
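
The kubeconfig lookup described above (the --kubeconfig flag or the KUBECONFIG environment variable) could be sketched roughly as follows. This is only an illustration: the function name resolve_kubeconfig is hypothetical, and the precedence of the flag over the environment variable is an assumption, since the issue does not say which one wins.

```bash
#!/usr/bin/env bash
# Sketch only: resolve the kubeconfig path for `dss initialise`.
# Assumption (not stated in the issue): the --kubeconfig flag value
# takes precedence over the KUBECONFIG environment variable.
resolve_kubeconfig() {
  local flag_value="${1:-}"   # value passed via --kubeconfig, may be empty
  if [ -n "$flag_value" ]; then
    printf '%s\n' "$flag_value"
  elif [ -n "${KUBECONFIG:-}" ]; then
    printf '%s\n' "$KUBECONFIG"
  else
    echo "error: pass --kubeconfig or set KUBECONFIG" >&2
    return 1
  fi
}
```

Note that KUBECONFIG may hold a colon-separated list of files; this sketch passes the value through unchanged and leaves any splitting to the caller.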

When is the task considered done

syncronize-issues-to-jira[bot] commented 8 months ago

Thank you for reporting us your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5171.

This message was autogenerated

misohu commented 7 months ago

After discussion with the team, we decided to go with a design in which we don't use Juju and charms to deploy the notebook server and the MLflow stack. In the newest design, DSS components are deployed as plain Kubernetes objects (Deployments, Services and PVCs). DSS will also deploy MLflow in local mode. This means we no longer need MySQL and MinIO, as everything will be stored in local folders.
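
To make the "plain Kubernetes objects" idea concrete, here is a rough sketch of a shell function that prints a PVC, a Deployment and a Service for an MLflow server that keeps its data in a folder on the volume. Everything specific in it is an assumption for illustration and not taken from this issue: the function name mlflow_manifests, the namespace dss, the object name mlflow, the 1Gi size, the /mlruns path and the image, which the caller must supply. Port 5000 matches the MLflow URL shown in the acceptance criteria.

```bash
#!/usr/bin/env bash
# Sketch only: print manifests for an MLflow server in local (file store) mode.
# Hypothetical names: namespace "dss", object name "mlflow", path /mlruns.
# The image is supplied by the caller; the issue does not name one.
mlflow_manifests() {
  local image="${1:?usage: mlflow_manifests IMAGE}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mlflow
  namespace: dss
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mlflow
  namespace: dss
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mlflow
  template:
    metadata:
      labels:
        app: mlflow
    spec:
      containers:
        - name: mlflow
          image: ${image}
          # Runs and artifacts both live in a folder on the PVC.
          command: ["mlflow", "server", "--host", "0.0.0.0", "--port", "5000",
                    "--backend-store-uri", "/mlruns",
                    "--default-artifact-root", "/mlruns"]
          ports:
            - containerPort: 5000
          volumeMounts:
            - name: data
              mountPath: /mlruns
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: mlflow
---
apiVersion: v1
kind: Service
metadata:
  name: mlflow
  namespace: dss
spec:
  selector:
    app: mlflow
  ports:
    - port: 5000
      targetPort: 5000
EOF
}
```

Assuming the dss namespace already exists, the output could be applied with kubectl --kubeconfig /path/to/kubeconfig apply -f - by piping the function's output into it.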

The initialise command will also accept an optional --default-image parameter, which specifies the image to be used for the default user-notebook.

Because of this, dss initialise will:

When is the task considered done

  1. User can run

    dss initialise --kubeconfig=/path/to/kubeconfig

    with output:

    Access the notebook at http://10.152.183.223/notebook/user-namespace/user-notebook/
    Access MLflow ui at: http://10.152.183.34:5000

    The above URLs are accessible to the user, and the user can talk to the MLflow server from inside the notebook.

  2. User can pass the --default-image parameter to dss initialise to override the default notebook image used for the DSS notebook.
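
As a small illustration of the expected output in item 1, the final step of the command might look something like the sketch below. The function name print_access_urls is hypothetical, and how the two IP addresses are obtained is left open; the message text and URL layout are copied from the example output above.

```bash
#!/usr/bin/env bash
# Sketch only: print the two access URLs in the format shown in the issue.
# Arguments are the service IPs of the notebook and of MLflow; how they are
# discovered is an assumption left to the caller.
print_access_urls() {
  local notebook_ip="${1:?notebook IP required}"
  local mlflow_ip="${2:?MLflow IP required}"
  printf 'Access the notebook at http://%s/notebook/user-namespace/user-notebook/\n' "$notebook_ip"
  printf 'Access MLflow ui at: http://%s:5000\n' "$mlflow_ip"
}
```

One way to obtain each address, assuming the components are exposed as ClusterIP Services, would be kubectl get service NAME -o jsonpath='{.spec.clusterIP}', where NAME is whatever the Service ends up being called.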