hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[New Resource]: AWS SageMaker MLFlow Tracking Server #38055

Open philipgebus opened 1 week ago

philipgebus commented 1 week ago

Description

AWS SageMaker recently released experiment tracking via an MLflow Tracking Server.

Requested Resource(s) and/or Data Source(s)

aws_sagemaker_mlflow_tracking_server

Potential Terraform Configuration

resource "aws_sagemaker_mlflow_tracking_server" "this" {
  tracking_server_name = "mlflow-dev"
  artifact_store_uri   = "s3://bucket/prefix/"

  tracking_server_size = "Small" # one of "Small", "Medium", "Large"
  mlflow_version       = "2.13.2"

  role_arn = "arn:aws:iam::000000000000:role/tracking-server-role"

  automatic_model_registration    = true
  weekly_maintenance_window_start = "TUE:03:30"

  tags = {
    # Keys that contain colons must be quoted in HCL.
    "sagemaker:user-profile-arn" = "arn:aws:sagemaker:eu-central-1:000000000000:user-profile/d-***/johndoe"
    "sagemaker:domain-arn"       = aws_sagemaker_domain.mlflow.arn
  }
}

References

AWS blog post: https://aws.amazon.com/blogs/aws/manage-ml-and-generative-ai-experiments-using-amazon-sagemaker-with-mlflow

Code references:
https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/sagemaker#Client.CreateMlflowTrackingServer
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker/client/create_mlflow_tracking_server.html

Would you like to implement a fix?

No

jmeisele commented 6 days ago

How do you think this would work with a centralized model registry located in another AWS account? Would the MLflow server simply live in the same AWS account as the model registry, with the tracking server then referenced from each related SageMaker domain in dev, qa, and prod?

philipgebus commented 6 days ago

In my preferred solution, the MLflow tracking server runs in the centralized model registry AWS account and is accessed, as mentioned, by SageMaker domains located in the other AWS accounts. However, the SageMaker MLflow tracking server does not seem to support resource-based IAM policies (yet?), which would make cross-account access convenient. Hence, a cross-account setup would need either sts:AssumeRole (the SageMaker job role assumes an MLflow role in the central model registry account before logging training runs) or a custom synchronization approach.
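
A minimal sketch of the sts:AssumeRole variant, for illustration only: an IAM role in the central model registry account that job roles in the other accounts are allowed to assume. The variable names, the role and policy names, and the use of the `sagemaker-mlflow:*` action wildcard are assumptions, not something confirmed in this thread; the actions should be narrowed to what the jobs actually need.

```hcl
# Hypothetical sketch, deployed in the central model registry account.
# All names and inputs below are placeholders.

variable "workload_job_role_arns" {
  description = "SageMaker job role ARNs in the dev/qa/prod accounts (assumed input)"
  type        = list(string)
}

variable "tracking_server_arn" {
  description = "ARN of the MLflow tracking server in this account (assumed input)"
  type        = string
}

# Trust policy: lets the listed job roles call sts:AssumeRole on this role.
data "aws_iam_policy_document" "trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type        = "AWS"
      identifiers = var.workload_job_role_arns
    }
  }
}

resource "aws_iam_role" "mlflow_access" {
  name               = "mlflow-cross-account-access"
  assume_role_policy = data.aws_iam_policy_document.trust.json
}

# Permissions policy: access to the tracking server only.
data "aws_iam_policy_document" "mlflow" {
  statement {
    effect    = "Allow"
    actions   = ["sagemaker-mlflow:*"] # assumption: narrow this in practice
    resources = [var.tracking_server_arn]
  }
}

resource "aws_iam_role_policy" "mlflow" {
  name   = "mlflow-tracking-access"
  role   = aws_iam_role.mlflow_access.id
  policy = data.aws_iam_policy_document.mlflow.json
}
```

The job roles in the workload accounts would additionally need an identity policy that allows sts:AssumeRole on this role's ARN, and the training code would have to assume the role before logging runs.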

DrFaust92 commented 2 days ago

Thinking of working on this. I'm talking to the provider team about how to tackle it: new resources require the AWS SDK for Go v2, which AFAIK is not yet supported for SageMaker in the provider, so I need to understand how much work this is 😸