philschmid / terraform-aws-sagemaker-huggingface

Adjusting autoscaling predefined_metric_type to be editable #19

Open chelouche9 opened 1 year ago

chelouche9 commented 1 year ago

Hi @philschmid, I want to use this Terraform module. However, in my use case I need to deploy falcon40 as an async endpoint with a scaling policy based on the "HasBacklogWithoutCapacity" metric.

The current implementation in main.tf hardcodes the metric type:

resource "aws_appautoscaling_policy" "sagemaker_policy" {
  count              = local.use_autoscaling
  name               = "${var.name_prefix}-scaling-target-${random_string.ressource_id.result}"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.sagemaker_target[0].resource_id
  scalable_dimension = aws_appautoscaling_target.sagemaker_target[0].scalable_dimension
  service_namespace  = aws_appautoscaling_target.sagemaker_target[0].service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "SageMakerVariantInvocationsPerInstance"
    }
    target_value       = var.autoscaling.scaling_target_invocations
    scale_in_cooldown  = var.autoscaling.scale_in_cooldown
    scale_out_cooldown = var.autoscaling.scale_out_cooldown
  }
}

To solve this for my use case, I could replace the hardcoded predefined_metric_type with a variable whose default value is "SageMakerVariantInvocationsPerInstance", so existing users keep the current behavior (I think this is the default in AWS as well).
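
A minimal sketch of that change, assuming a new top-level variable. The name autoscaling_predefined_metric_type is hypothetical and not part of the module today; everything else in the resource is copied unchanged from main.tf and still relies on the module's existing locals and resources:

```hcl
# Hypothetical variable (name is an assumption, not in the module today).
# The default preserves the value currently hardcoded in main.tf.
variable "autoscaling_predefined_metric_type" {
  description = "Predefined metric type used by the target tracking scaling policy."
  type        = string
  default     = "SageMakerVariantInvocationsPerInstance"
}

resource "aws_appautoscaling_policy" "sagemaker_policy" {
  count              = local.use_autoscaling
  name               = "${var.name_prefix}-scaling-target-${random_string.ressource_id.result}"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.sagemaker_target[0].resource_id
  scalable_dimension = aws_appautoscaling_target.sagemaker_target[0].scalable_dimension
  service_namespace  = aws_appautoscaling_target.sagemaker_target[0].service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      # Previously hardcoded; now taken from the (hypothetical) variable above.
      predefined_metric_type = var.autoscaling_predefined_metric_type
    }
    target_value       = var.autoscaling.scaling_target_invocations
    scale_in_cooldown  = var.autoscaling.scale_in_cooldown
    scale_out_cooldown = var.autoscaling.scale_out_cooldown
  }
}
```

One caveat I am not fully sure about: as far as I can tell, "HasBacklogWithoutCapacity" is published as a CloudWatch metric rather than as a predefined metric type, so it may need a customized_metric_specification block or a step scaling policy instead. The sketch above only covers making the predefined type configurable.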

Would this be useful for your repo as well? If not, I will fork it and make the change only in my fork.

By the way, just wanted to point out that as an ML Engineer using HF and AWS, I see a lot of your content and find it very useful! Thanks for the effort and keep up the great work you are doing!

philschmid commented 1 year ago

Yes, happy to include the feature in the repo if you open a PR.