flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0

[Plugin] Weights and Biases Integration #2798

Open cosmicBboy opened 2 years ago

cosmicBboy commented 2 years ago

Users can already use Weights & Biases (W&B) in their Flyte workflows. For example, we have a tutorial that uses W&B for metrics tracking.

Similar to the MLflow integration proposal (https://github.com/flyteorg/flyte/issues/2797), it would be worth creating an Integration subsection for W&B, with the simplest case as the first example.

However, there's also an opportunity here for syntactic sugar that helps users initialize, configure, and log metrics and artifacts, similar to the MLflow integration proposal. Users would benefit from, e.g., tying W&B runs to Flyte metadata such as workflow execution ids, and we could encode "best practices" for using Flyte + W&B in our plugin SDK.
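As a rough illustration of tying W&B runs to Flyte metadata, the sketch below builds keyword arguments for wandb.init from Flyte execution identifiers. The helper name build_wandb_init_kwargs and the naming scheme are assumptions made for illustration, not part of flytekit or any existing plugin; project, group, name, tags, and config are existing wandb.init parameters. Inside a task, the identifiers could plausibly be read from flytekit.current_context().execution_id.

```python
from typing import Any, Dict


def build_wandb_init_kwargs(
    flyte_project: str, flyte_domain: str, execution_name: str, task_name: str
) -> Dict[str, Any]:
    """Hypothetical helper: map Flyte execution metadata to wandb.init kwargs."""
    return {
        "project": flyte_project,
        # Grouping by execution keeps all runs of one workflow execution together.
        "group": execution_name,
        "name": f"{task_name}-{execution_name}",
        "tags": [f"flyte-domain:{flyte_domain}"],
        "config": {"flyte_execution_id": execution_name},
    }


# Example values are made up; a real task would read them from the Flyte context.
kwargs = build_wandb_init_kwargs("flytesnacks", "development", "f3a1b2c4", "train_model")
print(kwargs["name"])
```

A task could then call wandb.init(**kwargs) itself, which is roughly the boilerplate a plugin would hide.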

API Proposal 1: Decorator Plugin

Use the task decorator and/or workflow decorator pattern to create a more seamless experience. This would introduce a new plugin pattern in flytekit, one that modifies the underlying function wrapped by @task and @workflow.

Example

from typing import List

import wandb
import flytekitplugins.wandb
from flytekit import task, dynamic

@dynamic
@flytekitplugins.wandb.experiment(
    # TBD: figure out what experiment-level configurations can be automatically
    # handled by Flyte, e.g. determining a project name that defaults to "{workflow_name}-{execution_id}"
)
def model_experiment(hyperparameter_grid: List[dict]):
    models = []
    data = ...
    for hyperparameters in hyperparameter_grid:
        models.append(train_model(hyperparameters=hyperparameters, data=data))
    ...

@task
@flytekitplugins.wandb.run(
    # TBD: figure out which wandb.init options would make sense here
    # https://docs.wandb.ai/ref/python/init
    # The project name will default to the parent workflow's project name.
)
def train_model(hyperparameters: dict, data: ...):
    # follow the wandb integrations guides based on ML framework of choice:
    # https://docs.wandb.ai/guides/integrations
    model = MySklearnModel(**hyperparameters)
    ... # fit

    wandb.log({"key": value})

    return model
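
One way such a decorator could work internally is sketched below, under stated assumptions: the run factory and the FakeTracker stand-in are hypothetical, and the tracker is injected so the example runs without W&B installed. In a real plugin the tracker would presumably be the wandb module, whose init and finish functions play the same roles.

```python
import functools
import inspect
from typing import Any, Callable, Dict, List, Tuple


class FakeTracker:
    """Stand-in for the wandb module so the sketch runs without W&B installed."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, Dict[str, Any]]] = []

    def init(self, **kwargs: Any) -> None:
        self.events.append(("init", kwargs))

    def finish(self) -> None:
        self.events.append(("finish", {}))


def run(tracker: Any, **init_kwargs: Any) -> Callable:
    """Hypothetical decorator factory: open a run before the function, close it after."""

    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)  # keeps the name and signature visible to an outer decorator
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            tracker.init(**init_kwargs)
            try:
                return fn(*args, **kwargs)
            finally:
                tracker.finish()  # runs even if the wrapped function raises

        return wrapper

    return decorator


tracker = FakeTracker()


@run(tracker, project="demo-project")
def train_model(hyperparameters: Dict[str, float]) -> float:
    # Placeholder for real training: just sum the hyperparameter values.
    return sum(hyperparameters.values())


result = train_model(hyperparameters={"alpha": 0.5, "beta": 1.5})
print(inspect.signature(train_model))
```

Preserving the signature with functools.wraps matters in this pattern because @task sits above the plugin decorator and needs to see the original type hints.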

API Proposal 2: extend @task and @workflow arguments

Task config plugins don't really make sense for W&B experiment tracking/logging: the task_config argument is typically used for task types that have specific backend resource requirements (e.g. Spark, Ray, MPI tasks), which is orthogonal to configuring experiments and logging metrics.

Therefore, to support similar functionality to proposal 1, we could introduce additional arguments to the @task and @workflow decorators, e.g.

from typing import List

import wandb
from flytekitplugins.wandb import RunConfig, ExperimentConfig
from flytekit import task, dynamic

@dynamic(..., logging_config=ExperimentConfig(...))
def model_experiment(hyperparameter_grid: List[dict]):
    models = []
    data = ...
    for hyperparameters in hyperparameter_grid:
        models.append(train_model(hyperparameters=hyperparameters, data=data))
    ...

import wandb
from flytekitplugins.wandb import RunConfig
from flytekit import task

@task(..., logging_config=RunConfig(...))
def train_model(hyperparameters: dict, data: ...):
    model = MySklearnModel(**hyperparameters)
    ... # fit
    return model
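
To make the shape of these config objects more concrete, here is a minimal sketch. The fields on ExperimentConfig and RunConfig and the resolve method are assumptions for illustration only; the proposal leaves their contents open. The sketch also shows the run inheriting the parent experiment's project name by default, as suggested in proposal 1.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ExperimentConfig:
    """Hypothetical experiment-level settings shared by all runs in a workflow."""

    project: Optional[str] = None
    tags: List[str] = field(default_factory=list)


@dataclass
class RunConfig:
    """Hypothetical run-level settings for a single task."""

    project: Optional[str] = None  # None means "inherit from the parent experiment"
    name: Optional[str] = None
    tags: List[str] = field(default_factory=list)

    def resolve(self, experiment: ExperimentConfig) -> Dict[str, Any]:
        """Merge with the parent experiment; run-level values take precedence."""
        return {
            "project": self.project or experiment.project,
            "name": self.name,
            "tags": experiment.tags + self.tags,
        }


# Example values are made up for the demonstration.
experiment = ExperimentConfig(project="model_experiment-f3a1b2c4", tags=["grid-search"])
resolved = RunConfig(name="train_model", tags=["sklearn"]).resolve(experiment)
print(resolved)
```

Keeping the configs as plain dataclasses would let flytekit pass them through logging_config without knowing anything about W&B itself.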

cosmicBboy commented 2 years ago

If we take inspiration from ZenML, they have a simple enable_wandb interface that would probably work well for us: https://github.com/zenml-io/zenml/tree/main/examples/wandb_tracking#using-wandbsettings
