[FR] Support for custom tags on AWS resources in SageMaker deploy

rafa-am commented 2 years ago

Willingness to contribute

The MLflow Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature (either as an MLflow Plugin or an enhancement to the MLflow code base)?

[x] Yes. I can contribute this feature independently.
[ ] Yes. I would be willing to contribute this feature with guidance from the MLflow community.
[ ] No. I cannot contribute this feature at this time.

Proposal Summary

Add support for passing custom tags to the AWS resources provisioned when deploying an endpoint (using the mlflow.sagemaker.deploy interface).

Motivation

What is the use case for this feature? For governance policy purposes, it is essential to apply a good set of tags to cloud resources. In some cases tagging is mandatory, and the current MLflow interface does not allow those tags to be applied on the AWS side.
Why is this use case valuable to support for MLflow users in general? I think using correct and complete tags is fundamentally good practice and should be part of routine deployments. We often skip it because the process isn't easy.
Why is this use case valuable to support for your project(s) or organization? My organization is applying a governance policy that requires every resource deployed on AWS to carry mandatory tags. Otherwise, the policy blocks the deployment of the resource on the fly. Currently, we deploy SageMaker endpoints using the mlflow.sagemaker.deploy interface (because it makes our lives easier) but apply custom tags after the procedure ends.
Why is it currently difficult to achieve this use case? (please be as specific as possible about why related MLflow features and components are insufficient) To customize tags, I need to call the same boto3 APIs that mlflow.sagemaker.deploy calls, right after the deployment process finishes. Those APIs already natively accept tags as arguments.
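
To illustrate the workaround described above, here is a minimal sketch of tagging the resources after the deployment finishes. The helper name tag_endpoint_resources is hypothetical, and it assumes the client passed in behaves like boto3.client("sagemaker") (describe_endpoint, describe_endpoint_config, describe_model, add_tags). It is not what MLflow does internally.

```python
def tag_endpoint_resources(sage_client, endpoint_name, tags):
    """Apply `tags` to an endpoint, its endpoint config, and its models.

    `sage_client` is assumed to behave like boto3.client("sagemaker").
    `tags` uses the boto3 format: [{"Key": "...", "Value": "..."}].
    Returns the list of ARNs that were tagged.
    """
    endpoint = sage_client.describe_endpoint(EndpointName=endpoint_name)
    config = sage_client.describe_endpoint_config(
        EndpointConfigName=endpoint["EndpointConfigName"]
    )
    arns = [endpoint["EndpointArn"], config["EndpointConfigArn"]]

    # Each production variant references one SageMaker model.
    for variant in config["ProductionVariants"]:
        model = sage_client.describe_model(ModelName=variant["ModelName"])
        arns.append(model["ModelArn"])

    for arn in arns:
        sage_client.add_tags(ResourceArn=arn, Tags=tags)
    return arns


# Example call (needs boto3 and AWS credentials, so it is left commented out):
# import boto3
# tag_endpoint_resources(
#     boto3.client("sagemaker"), "my-endpoint", [{"Key": "team", "Value": "ml"}]
# )
```

Because the tags are only attached after creation, a policy that rejects untagged resources at creation time would still block the deployment, which is why passing tags through the deploy call itself is needed.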

What component(s), interfaces, languages, and integrations does this feature affect?

Components

[ ] area/artifacts: Artifact stores and artifact logging
[ ] area/build: Build and test infrastructure for MLflow
[ ] area/docs: MLflow documentation pages
[ ] area/examples: Example code
[ ] area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
[ ] area/models: MLmodel format, model serialization/deserialization, flavors
[ ] area/projects: MLproject format, project running backends
[x] area/scoring: MLflow Model server, model deployment tools, Spark UDFs
[ ] area/server-infra: MLflow Tracking server backend
[ ] area/tracking: Tracking Service, tracking client APIs, autologging

Interfaces

[ ] area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
[ ] area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
[ ] area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
[ ] area/windows: Windows support

Languages

[ ] language/r: R APIs and clients
[ ] language/java: Java APIs and clients
[ ] language/new: Proposals for new client languages

Integrations

[ ] integrations/azure: Azure and Azure ML integrations
[x] integrations/sagemaker: SageMaker integrations
[ ] integrations/databricks: Databricks integrations

Details

The current mlflow.sagemaker.deploy(...) interface (link) doesn't offer a way to customize tags on the AWS resources associated with a SageMaker endpoint deployment.

For example, the deployment process calls many boto3 APIs, and almost all of them accept a tags parameter:

[1] at _upload_s3(...):

https://github.com/mlflow/mlflow/blob/8bc6b36150f1dd11f5c550c4aba9a3003bc4cc85/mlflow/sagemaker/__init__.py#L1225

[2] at _create_sagemaker_model(...):

https://github.com/mlflow/mlflow/blob/8bc6b36150f1dd11f5c550c4aba9a3003bc4cc85/mlflow/sagemaker/__init__.py#L1661-L1666

(in this case, specific and static tags [{"Key": "model_uri", "Value": str(model_uri)}] are passed to sage_client.create_model)

[3] at _create_sagemaker_endpoint(...):

https://github.com/mlflow/mlflow/blob/8bc6b36150f1dd11f5c550c4aba9a3003bc4cc85/mlflow/sagemaker/__init__.py#L1448-L1452

(in this case, specific and static tags [{"Key": "app_name", "Value": endpoint_name}] are passed to sage_client.create_endpoint_config)

and

https://github.com/mlflow/mlflow/blob/8bc6b36150f1dd11f5c550c4aba9a3003bc4cc85/mlflow/sagemaker/__init__.py#L1457-L1461

(in this case, empty tags [] are passed to sage_client.create_endpoint).

The idea is to add a custom tags input parameter to the mlflow.sagemaker.deploy(..., custom_tags=...) signature. Those tags could be merged with the ones currently in use and passed to the boto3 API calls as well.

One possible strategy is a dictionary structure that assigns specific tags to specific resources (or APIs). Alternatively, a single set of tags could be applied to all of them.

{
  "Tags": {
    "model_object_tags": [{...}],
    "endpoint_config_tags": [{...}],
    "endpoint_tags": [{...}],
    ...
  }
}

{
  "Tags": [{...}]
}

Of course, the custom tags would be optional, and a format checker would be required. Additional adjustments would be needed for the deploy CLI command (link).
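
As a rough sketch of what the format checker and the merge step could look like (all function names here are hypothetical, and the resource keys simply reuse the ones from the structure above):

```python
def validate_tags(tags):
    """Check that `tags` is a list of {"Key": str, "Value": str} dicts."""
    if not isinstance(tags, list):
        raise ValueError("tags must be a list of dicts")
    for tag in tags:
        if not isinstance(tag, dict) or set(tag) != {"Key", "Value"}:
            raise ValueError(f"invalid tag entry: {tag!r}")
        if not all(isinstance(tag[k], str) for k in ("Key", "Value")):
            raise ValueError(f"tag key and value must be strings: {tag!r}")
    return tags


def merge_tags(default_tags, custom_tags=None):
    """Combine built-in tags with optional custom ones; custom wins on a key clash."""
    merged = {t["Key"]: t["Value"] for t in validate_tags(default_tags)}
    for t in validate_tags(custom_tags or []):
        merged[t["Key"]] = t["Value"]
    return [{"Key": k, "Value": v} for k, v in merged.items()]


def tags_for(resource, default_tags, custom_tags=None):
    """Resolve tags for one resource.

    `custom_tags` may be a dict keyed by resource (e.g. "endpoint_tags")
    or a plain list that applies to every resource.
    """
    if isinstance(custom_tags, dict):
        custom = custom_tags.get(resource, [])
    else:
        custom = custom_tags or []
    return merge_tags(default_tags, custom)
```

For instance, tags_for("endpoint_config_tags", [{"Key": "app_name", "Value": endpoint_name}], custom_tags) would yield the list to pass as Tags to sage_client.create_endpoint_config. Letting custom tags override the built-in ones on a key clash is a design choice of this sketch, not something settled in the proposal.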

The proposed feature could inspire an extensible approach for other interfaces in the module to which custom tags can be applied.

dbczumar commented 2 years ago

@rafa-am Thank you for filing this thorough, detailed feature request. We'd be happy to review a pull request for this feature. Can we add tag specification support to the SageMakerDeploymentClient (https://mlflow.org/docs/latest/python_api/mlflow.sagemaker.html#mlflow.sagemaker.SageMakerDeploymentClient) as part of the config dictionary?

We're planning to deprecate mlflow.sagemaker.deploy() soon in favor of SageMakerDeploymentClient, since SageMakerDeploymentClient conforms to the MLflow deployments API - https://mlflow.org/docs/latest/models.html#deployment-to-custom-targets.

rafa-am commented 2 years ago

Of course! That makes a lot of sense. Thank you for the feedback, @dbczumar.

I'm using an MLflow version older than 1.24 and have not yet worked with SageMakerDeploymentClient. I'll check it out and work on the pull request.

dbczumar commented 2 years ago

Hi @rafa-am, any updates here?

cdreetz commented 3 months ago

is this basically done? @dbczumar @BenWilson2

looks like tags are supported for sagemaker client now

https://github.com/mlflow/mlflow/blob/e5d1280a467f0b51692ff3edf29e191fc52f36ef/mlflow/sagemaker/__init__.py#L1401
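
For anyone landing here, a minimal sketch of how tags might be passed through the deployments client config. This assumes the config dictionary accepts a tags key holding plain key-value pairs, as the linked code suggests; the helper names build_sagemaker_config and deploy_with_tags are hypothetical, and the example values are placeholders. Please check the docs for your MLflow version.

```python
def build_sagemaker_config(region_name, instance_type, tags=None):
    """Build a config dict for the SageMaker deployment client.

    Assumption: the client reads custom tags from a "tags" key as a plain
    {key: value} dict. Verify this against your MLflow version.
    """
    config = {"region_name": region_name, "instance_type": instance_type}
    if tags:
        config["tags"] = dict(tags)
    return config


def deploy_with_tags(name, model_uri, config):
    """Create the deployment; requires mlflow, boto3 and AWS credentials."""
    from mlflow.deployments import get_deploy_client

    client = get_deploy_client("sagemaker")
    return client.create_deployment(name=name, model_uri=model_uri, config=config)


# Example (placeholder values, left commented out because it needs AWS access):
# cfg = build_sagemaker_config("us-east-1", "ml.m5.large", {"team": "ml"})
# deploy_with_tags("my-endpoint", "models:/my-model/1", cfg)
```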
