flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0

[Plugin] MLflow integration #2797

Open cosmicBboy opened 2 years ago

cosmicBboy commented 2 years ago

The purpose of this issue is to come up with a way of using MLflow in a Flyte task.

The lightweight integration would simply be to use the MLflow tracking API within a Flyte task to log metrics. Indeed, some of our users are already doing this, and it might make sense to document this in an "MLflow" section of the integration docs.
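A minimal sketch of that lightweight pattern (assuming the tracking server is configured via the task's environment, e.g. MLFLOW_TRACKING_URI, and using placeholder experiment/metric names):

import mlflow
from flytekit import task

@task
def train(learning_rate: float) -> float:
    mlflow.set_experiment("my-experiment")  # placeholder experiment name
    with mlflow.start_run():
        mlflow.log_param("learning_rate", learning_rate)
        accuracy = 0.9  # placeholder for an actual training loop
        mlflow.log_metric("accuracy", accuracy)
    return accuracy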

However, there's an opportunity for a declarative API that would handle some subset of the MLflow logging functions, for example (the corresponding raw MLflow calls are sketched after this list):

  1. setting the tracking URI
  2. creating experiments
  3. setting an experiment
  4. handling start/end runs
  5. automatic logging based on the ML framework
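For reference, these correspond roughly to the following plain MLflow tracking calls (a sketch with placeholder values):

import mlflow

# 1. setting the tracking URI
mlflow.set_tracking_uri("http://mlflow.example.com")  # placeholder URI

# 2. / 3. creating an experiment and setting it as the active one
mlflow.create_experiment("my-experiment")  # raises if the experiment already exists
mlflow.set_experiment("my-experiment")

# 4. handling start/end runs
with mlflow.start_run():
    # 5. automatic logging based on the ML framework
    mlflow.autolog()
    ...  # train a model; params/metrics/artifacts are logged automatically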

This would benefit end users by letting them follow what we think of as "best practices" for using MLflow in Flyte, e.g. not having to think about how to parameterize and name experiments correctly.

For example, the plugin could be installed with:

pip install flytekitplugins-mlflow

API Proposal 1: Decorator Plugin

Use the task decorator and/or workflow decorator pattern to create a more seamless experience. This would introduce a new plugin pattern in flytekit, which modifies the underlying function wrapped by @task and @workflow.

Example

from typing import List

import mlflow
import flytekitplugins.mlflow
from flytekit import task, dynamic

@dynamic
@flytekitplugins.mlflow.experiment(
    # args to https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.create_experiment
    name=None,  # defaults to "{workflow_name}-{execution_id}" ?
    artifact_location=None,  # defaults to the flytekit location?
    tags=...,
)
def model_experiment(hyperparameter_grid: List[dict]):
    models = []
    data = ...
    for hyperparameters in hyperparameter_grid:
        models.append(train_model(hyperparameters=hyperparameters, data=data))
    ...

@task
@flytekitplugins.mlflow.run(
    # by default, this run will use the parent workflow's mlflow experiment config
    params="hyperparameters",  # log config parameters automatically
    autolog=True,  # enable autologging, could also be a dict of mlflow.autolog args: https://mlflow.org/docs/latest/python_api/mlflow.html#mlflow.autolog
)
def train_model(hyperparameters: dict, data: ...):
    model = MySklearnModel(**hyperparameters)
    ... # fit

    # without autolog=True, users can manually log here
    mlflow.log_metric("key", value)

    return model
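At the implementation level, the run decorator above could be little more than a wrapper that opens an MLflow run around the task body. A rough sketch (the decorator name and arguments are the hypothetical ones proposed here, not an existing flytekit API):

import functools
from typing import Optional, Union

import mlflow

def run(params: Optional[str] = None, autolog: Union[bool, dict] = False):
    """Hypothetical flytekitplugins.mlflow.run decorator."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if autolog:
                # autolog may be True or a dict of mlflow.autolog kwargs
                mlflow.autolog(**(autolog if isinstance(autolog, dict) else {}))
            with mlflow.start_run():
                if params is not None and params in kwargs:
                    # log the named task input (e.g. the hyperparameters dict) as run params
                    mlflow.log_params(kwargs[params])
                return fn(*args, **kwargs)
        return wrapper
    return decorator

Applied beneath @task as in the example, @task would then wrap the already MLflow-aware function; the experiment decorator would do the analogous setup at the workflow level.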

API Proposal 2: extend @task and @workflow arguments

Task config plugins don't really make sense for MLflow experiment tracking/logging, since the task_config argument is typically used for task types with specific backend resource requirements (e.g. Spark, Ray, MPI tasks) and is orthogonal to configuring experiments and logging metrics.
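For context, task_config today looks something like the following for backend plugins (a sketch assuming flytekitplugins-spark is installed):

from flytekit import task
from flytekitplugins.spark import Spark

@task(task_config=Spark(spark_conf={"spark.executor.memory": "2g"}))
def aggregate(path: str) -> int:
    # task_config selects the Spark task type and its backend resources;
    # it says nothing about experiment tracking or metric logging
    ...
    return 0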

Therefore, to support similar functionality to proposal 1, we could introduce additional arguments to the @task and @workflow decorators, e.g.

from typing import List

import mlflow
from flytekitplugins.mlflow import RunConfig, ExperimentConfig
from flytekit import task, dynamic

@dynamic(..., logging_config=ExperimentConfig(name=..., artifact_location=..., tags=...))
def model_experiment(hyperparameter_grid: List[dict]):
    models = []
    data = ...
    for hyperparameters in hyperparameter_grid:
        models.append(train_model(hyperparameters=hyperparameters, data=data))
    ...

@task(..., logging_config=RunConfig(experiment=..., params=..., autolog=...))
def train_model(hyperparameters: dict):
    model = MySklearnModel(**hyperparameters)
    ... # fit

    # without autolog=True, users can manually log here
    mlflow.log_metric("key", value)

    return model
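The config objects here could be simple dataclasses that mirror the MLflow arguments they forward (a hypothetical sketch; flytekitplugins.mlflow.ExperimentConfig/RunConfig don't exist yet):

from dataclasses import dataclass, field
from typing import Optional, Union

@dataclass
class ExperimentConfig:
    # forwarded to mlflow.create_experiment / mlflow.set_experiment
    name: Optional[str] = None  # could default to "{workflow_name}-{execution_id}"
    artifact_location: Optional[str] = None
    tags: dict = field(default_factory=dict)

@dataclass
class RunConfig:
    # per-task run configuration
    experiment: Optional[ExperimentConfig] = None  # defaults to the parent workflow's experiment
    params: Optional[str] = None  # name of the task input to log as run params
    autolog: Union[bool, dict] = False  # True/False, or kwargs for mlflow.autolog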
kumare3 commented 2 years ago

I do agree task plugins don't make sense. Is a decorator a better pattern? Actually, thinking about it a little more, a more native integration might be better, especially if we can find commonality with W&B, MLflow, and Flyte Decks.

cosmicBboy commented 2 years ago

Actually, thinking about it a little more, a more native integration might be better

What would that look like?

cosmicBboy commented 2 years ago

If we take inspiration from ZenML, they have a simple enable_mlflow decorator:

https://github.com/zenml-io/zenml/tree/main/examples/mlflow_tracking#-how-the-example-is-implemented
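In flytekit terms, that shape might look like the following (enable_mlflow here is a hypothetical decorator in the ZenML style, not an existing flytekit API):

import functools

import mlflow
from flytekit import task

def enable_mlflow(fn):
    # hypothetical ZenML-style decorator: autolog and run the task body inside an MLflow run
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        mlflow.autolog()
        with mlflow.start_run():
            return fn(*args, **kwargs)
    return wrapper

@task
@enable_mlflow
def train(hyperparameters: dict) -> float:
    ...  # framework-level params and metrics are autologged to the active run
    return 0.0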

github-actions[bot] commented 1 year ago

Hello 👋, This issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will close the issue if we detect no activity in the next 7 days. Thank you for your contribution and understanding! 🙏

github-actions[bot] commented 1 year ago

Hello 👋, This issue has been inactive for over 9 months and hasn't received any updates since it was marked as stale. We'll be closing this issue for now, but if you believe this issue is still relevant, please feel free to reopen it. Thank you for your contribution and understanding! 🙏

github-actions[bot] commented 3 months ago

Hello 👋, this issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will engage on it to decide if it is still applicable. Thank you for your contribution and understanding! 🙏