mlflow / mlflow

Open source platform for the machine learning lifecycle
https://mlflow.org
Apache License 2.0
18.76k stars 4.23k forks source link

[FR] Support to custom tags at AWS resources on Sagemaker deploy #5706

Open rafa-am opened 2 years ago

rafa-am commented 2 years ago

Willingness to contribute

The MLflow Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature (either as an MLflow Plugin or an enhancement to the MLflow code base)?

Proposal Summary

Add support to pass custom tags to AWS resources provisioned on endpoint deploy action (using mlflow.sagemaker.deploy interface).

Motivation

What component(s), interfaces, languages, and integrations does this feature affect?

Components

Interfaces

Languages

Integrations

Details

The current mlflow.sagemaker.deploy(...) interface (link) doesn't offer a way to customize tags on AWS resources associated to Sagemaker endpoint deployment.

For example, the deployment process calls a lot of boto3 APIs and almost all their arguments accept tags parameter customization:

[1] at _upload_s3(...):

https://github.com/mlflow/mlflow/blob/8bc6b36150f1dd11f5c550c4aba9a3003bc4cc85/mlflow/sagemaker/__init__.py#L1225

[2] at _create_sagemaker_model(...):

https://github.com/mlflow/mlflow/blob/8bc6b36150f1dd11f5c550c4aba9a3003bc4cc85/mlflow/sagemaker/__init__.py#L1661-L1666

(in case, specific and static tags [{"Key": "model_uri", "Value": str(model_uri)}] are passed to sage_client.create_model)

[3] at _create_sagemaker_endpoint(...):

https://github.com/mlflow/mlflow/blob/8bc6b36150f1dd11f5c550c4aba9a3003bc4cc85/mlflow/sagemaker/__init__.py#L1448-L1452

(in case, specific and static tags [{"Key": "app_name", "Value": endpoint_name}] are passed to sage_client.create_endpoint_config)

and

https://github.com/mlflow/mlflow/blob/8bc6b36150f1dd11f5c550c4aba9a3003bc4cc85/mlflow/sagemaker/__init__.py#L1457-L1461

(in case, empty tags [] are passed to sage_client.create_endpoint).

The idea is adding a tag custom input parameter to mlflow.sagemaker.deploy(..., custom_tags=...) signature. That tags could be joined to that currently in use and passed to boto3 API calls too.

A possible strategy could define a dictionary structure to address specific tags to specific resources (or APIs). Otherwise, a unique set of tags could be applied to all ones.

{
  "Tags": {
    "model_object_tags": [{...}],
    "endpoint_config_tags": [{...}],
    "endpoint_tags": [{...}],
    ...
  }
}
{
  "Tags": {[...]}
}

Of course, the custom tags would be optional and a format checker required. Additional adjustments would be necessary to deploy CLI command (link).


The proposed feature could inspire an extensible approach to other interfaces of the module to which custom tags can be applied.


dbczumar commented 2 years ago

@rafa-am Thank you for filing this thorough, detailed feature request. We'd be happy to review a pull request for this feature. Can we add tag specification support to the SageMakerDeploymentClient (https://mlflow.org/docs/latest/python_api/mlflow.sagemaker.html#mlflow.sagemaker.SageMakerDeploymentClient) as part of the config dictionary?

We're planning to deprecate mlflow.sagemaker.deploy() soon in favor of SageMakerDeploymentClient, sinceSageMakerDeploymentClient` conforms to the MLflow deployments API - https://mlflow.org/docs/latest/models.html#deployment-to-custom-targets.

rafa-am commented 2 years ago

Of course! Make a lot of sense. Thank u for feedback, @dbczumar.

I'm using a mlflow version older than 1.24 and have not yet contact to this SageMakerDeploymentClient. I'll check it and work on the pull request.

dbczumar commented 2 years ago

Hi @rafa-am, any updates here?

cdreetz commented 2 months ago

is this basically done? @dbczumar @BenWilson2

looks like tags are supported for sagemaker client now

https://github.com/mlflow/mlflow/blob/e5d1280a467f0b51692ff3edf29e191fc52f36ef/mlflow/sagemaker/__init__.py#L1401