DougTrajano / mlflow-server

MLflow Tracking Server with basic auth deployed in AWS App Runner.
https://gallery.ecr.aws/t9j8s4z8/mlflow
Apache License 2.0

MLflow with basic auth

This project is ARCHIVED and will no longer be maintained, because MLflow now supports basic authentication natively! \o/

The Docker Compose file below starts MLflow 2.5.0 with the built-in basic-auth app:

```yaml
version: "3.9"

services:
  mlflow:
    image: ghcr.io/mlflow/mlflow:v2.5.0
    ports:
      - 5000:5000
    command: mlflow server --host 0.0.0.0 --app-name basic-auth
```

Find further details in the MLflow Authentication page of the MLflow 2.5.0 documentation.



A dockerized MLflow Tracking Server with basic auth (username and password).

You can deploy the server in three ways: on AWS, on Heroku, or locally.

A Terraform stack is provided to deploy the MLflow Tracking Server.

NOTE: This project is intended for testing and development, not for production deployments.

Environment Variables

The environment variables below are required to deploy this project.

| Variable | Description | Default |
| --- | --- | --- |
| PORT | Port for the MLflow server | 80 |
| MLFLOW_ARTIFACT_URI | S3 Bucket URI for MLflow's artifact store | "./mlruns" |
| MLFLOW_BACKEND_URI | SQLAlchemy database URI (if provided, the other MLFLOW_DB_* variables are ignored) | |
| DATABASE_URL | SQLAlchemy database URI used by the Heroku deployment; its value is moved to MLFLOW_BACKEND_URI | |
| MLFLOW_DB_DIALECT | Database dialect (e.g. postgresql, mysql+pymysql, sqlite) | "postgresql" |
| MLFLOW_DB_USERNAME | Backend store username | "mlflow" |
| MLFLOW_DB_PASSWORD | Backend store password | "mlflow" |
| MLFLOW_DB_HOST | Backend store host | |
| MLFLOW_DB_PORT | Backend store port | 3306 |
| MLFLOW_DB_DATABASE | Backend store database | "mlflow" |
| MLFLOW_TRACKING_USERNAME | Username for MLflow UI and API | "mlflow" |
| MLFLOW_TRACKING_PASSWORD | Password for MLflow UI and API | "mlflow" |
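
The way the backend store variables above combine can be sketched in Python. This is a minimal illustration, not the project's actual startup code: the function name `build_backend_uri`, the order in which `MLFLOW_BACKEND_URI` and `DATABASE_URL` are checked, the URL-escaping of credentials, and the error raised for a missing host are all assumptions.

```python
import os
from urllib.parse import quote_plus


def build_backend_uri(env=None):
    """Return a SQLAlchemy URI from the documented environment variables."""
    env = os.environ if env is None else env

    # A full URI wins; the individual MLFLOW_DB_* variables are then ignored.
    if env.get("MLFLOW_BACKEND_URI"):
        return env["MLFLOW_BACKEND_URI"]

    # Heroku provides DATABASE_URL; here it is used as the backend URI (assumed order).
    if env.get("DATABASE_URL"):
        return env["DATABASE_URL"]

    # Fall back to the individual parts, using the documented defaults.
    dialect = env.get("MLFLOW_DB_DIALECT", "postgresql")
    username = env.get("MLFLOW_DB_USERNAME", "mlflow")
    password = env.get("MLFLOW_DB_PASSWORD", "mlflow")
    port = env.get("MLFLOW_DB_PORT", "3306")
    database = env.get("MLFLOW_DB_DATABASE", "mlflow")

    host = env.get("MLFLOW_DB_HOST")
    if not host:
        # MLFLOW_DB_HOST has no documented default, so it must be set (assumed behavior).
        raise ValueError("MLFLOW_DB_HOST must be set when no full URI is provided")

    # Escape credentials so characters such as '@' or ':' do not break the URI.
    return (
        f"{dialect}://{quote_plus(username)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )
```

For example, passing only a host, as in `build_backend_uri({"MLFLOW_DB_HOST": "db.example.com"})`, falls through to the defaults for every other part.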

Deploying MLflow Tracking Server

AWS

Amazon ECR

[Amazon Elastic Container Registry (ECR)](https://aws.amazon.com/ecr/) is a fully managed container registry that makes it easy to store, manage, share, and deploy your container images and artifacts anywhere.

App Runner

[AWS App Runner](https://aws.amazon.com/apprunner/) is a fully managed service that makes it easy for developers to quickly deploy containerized web applications and APIs, at scale and with no prior infrastructure experience required. Start with your source code or a container image.

Amazon S3

[Amazon Simple Storage Service (Amazon S3)](https://aws.amazon.com/s3/) is an object storage service that offers industry-leading scalability, data availability, security, and performance.

Amazon Aurora Serverless

[Amazon Aurora Serverless](https://aws.amazon.com/rds/aurora/serverless/) is an on-demand, auto-scaling configuration for Amazon Aurora. It automatically starts up, shuts down, and scales capacity up or down based on your application's needs. You can run your database on AWS without managing database capacity.

Prerequisites

To deploy MLflow, you'll need to:

  1. Create an AWS account if you don't already have one.

  2. Configure AWS CLI to use your AWS account.

  3. Clone this repository.

     ```shell
     git clone https://github.com/DougTrajano/mlflow-server.git
     ```

  4. Open the mlflow-server/terraform folder.

     ```shell
     cd mlflow-server/terraform
     ```

  5. Run the following commands to create all the required resources:

     ```shell
     terraform init
     terraform apply -var mlflow_username="YOUR-USERNAME" -var mlflow_password="YOUR-PASSWORD"
     ```

     Multiple usernames and passwords can also be specified as comma-delimited strings:

     ```shell
     terraform apply -var mlflow_username="USERNAME1,USERNAME2,USERNAME3" -var mlflow_password="PASSWORD1,PASSWORD2,PASSWORD3"
     ```

     See terraform/variables.tf for the full list of variables that can be used.

  6. Type "yes" when prompted to continue.

     ```text
     Plan: 21 to add, 0 to change, 0 to destroy.

     Changes to Outputs:
       + artifact_bucket_id = (known after apply)
       + mlflow_password    = (sensitive value)
       + mlflow_username    = "doug"
       + service_url        = (known after apply)
       + status             = (known after apply)

     Do you want to perform these actions?
       Terraform will perform the actions described above.
       Only 'yes' will be accepted to approve.

       Enter a value: yes
     ```
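
The comma-delimited `mlflow_username` and `mlflow_password` values passed to `terraform apply` earlier presumably pair up by position. A minimal Python sketch of that pairing, assuming positional matching; the function name `pair_credentials` is hypothetical and not part of this project:

```python
def pair_credentials(usernames, passwords):
    """Pair comma-delimited usernames and passwords by position."""
    users = usernames.split(",")
    secrets = passwords.split(",")

    # Each username needs exactly one password at the same position.
    if len(users) != len(secrets):
        raise ValueError(
            f"got {len(users)} usernames but {len(secrets)} passwords"
        )

    return dict(zip(users, secrets))
```

Under this assumption, a password cannot itself contain a comma, because the comma would be read as a separator.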

Applying the plan creates the AWS resources defined in the Terraform stack.

Heroku

Prerequisites

  1. Create an AWS account if you don't already have one.

  2. Configure AWS CLI to use your AWS account.

  3. Clone this repository.

     ```shell
     git clone https://github.com/DougTrajano/mlflow-server.git
     ```

  4. Open the mlflow-server/terraform folder.

     ```shell
     cd mlflow-server/terraform
     ```

  5. Run the following commands to create only the S3 bucket:

     ```shell
     terraform init
     terraform apply -var environment="heroku" -target="module.s3"
     ```

  6. Type "yes" when prompted to continue.

     ```text
     Plan: 5 to add, 0 to change, 0 to destroy.

     Changes to Outputs:
       + artifact_bucket_id = (known after apply)

     Do you want to perform these actions?
       Terraform will perform the actions described above.
       Only 'yes' will be accepted to approve.

       Enter a value: yes
     ```

  7. Create an IAM Policy for the S3 bucket as follows:

     IAM Policy example

     ```json
     {
       "Version": "2012-10-17",
       "Statement": [
         {
           "Effect": "Allow",
           "Action": [
             "s3:ListBucket"
           ],
           "Resource": "arn:aws:s3:::mlflow-heroku-20220723133820303500000001"
         },
         {
           "Effect": "Allow",
           "Action": [
             "s3:*",
             "s3-object-lambda:*"
           ],
           "Resource": "arn:aws:s3:::mlflow-heroku-20220723133820303500000001/*"
         }
       ]
     }
     ```

  8. Create an IAM User and attach the IAM Policy created in the previous step.

     Take note of the IAM User access key and secret key; you'll need them when configuring the Heroku deployment.

  9. Click on the "Deploy to Heroku" button below.

     Deploy

  10. Follow the instructions on the new page to create an MLflow Tracking Server.
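
The IAM policy in the steps above is tied to one specific bucket name. A small Python sketch that produces the same policy document for any bucket; the function name `build_bucket_policy` and the bucket name `my-mlflow-bucket` are hypothetical:

```python
import json


def build_bucket_policy(bucket_name):
    """Return the example IAM policy as a dict for the given S3 bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
            },
            {
                # Object-level actions apply to every key inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:*", "s3-object-lambda:*"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Print JSON that can be pasted into the IAM console.
    print(json.dumps(build_bucket_policy("my-mlflow-bucket"), indent=2))
```

Pass the bucket name reported by the `artifact_bucket_id` output of `terraform apply`.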

Local

Prerequisites

  1. Clone this repository.
     ```shell
     git clone https://github.com/DougTrajano/mlflow-server.git
     ```

  2. Open the mlflow-server folder.

     ```shell
     cd mlflow-server
     ```

  3. Run the following command to build and start the containers:

     ```shell
     docker-compose up -d --build
     ```

Using your deployed MLflow

The URL of the MLflow Tracking Server depends on the deployment method you chose.

You can also track your experiments using the MLflow API.

```python
import os
import mlflow

os.environ["MLFLOW_TRACKING_URI"] = "<<YOUR-MLFLOW-TRACKING-URI>>"
os.environ["MLFLOW_EXPERIMENT_NAME"] = "<<YOUR-EXPERIMENT-NAME>>"
os.environ["MLFLOW_TRACKING_USERNAME"] = "<<YOUR-MLFLOW-USERNAME>>"
os.environ["MLFLOW_TRACKING_PASSWORD"] = "<<YOUR-MLFLOW-PASSWORD>>"

# An AWS access key and secret key are required to upload artifacts to the S3 bucket
os.environ["AWS_ACCESS_KEY_ID"] = "<<AWS-ACCESS-KEY-ID>>"
os.environ["AWS_SECRET_ACCESS_KEY"] = "<<AWS-SECRET-ACCESS-KEY>>"

SEED = 1993

# The context manager ends the run even if logging raises an exception
with mlflow.start_run():
    mlflow.log_param("seed", SEED)
```
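
The same username and password also protect the REST API. As a sketch of an authenticated call made without the MLflow client, the snippet below builds (without sending) an HTTP basic-auth request for MLflow's `experiments/get-by-name` REST endpoint using only the standard library. The function name `build_experiment_request` is hypothetical, and whether your deployment exposes the API at this path is an assumption.

```python
import base64
import urllib.parse
import urllib.request


def build_experiment_request(tracking_uri, username, password, experiment_name):
    """Build, but do not send, an authenticated GET request for one experiment."""
    query = urllib.parse.urlencode({"experiment_name": experiment_name})
    base = tracking_uri.rstrip("/")
    url = f"{base}/api/2.0/mlflow/experiments/get-by-name?{query}"

    # HTTP basic auth sends "username:password" encoded as base64.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")

    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request
```

Passing the returned object to `urllib.request.urlopen` would send the request. Base64 is an encoding, not encryption, so the credentials are only protected in transit when the tracking URI uses HTTPS.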
