Support for dynamic quota in azurerm_cognitive_deployment

hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager

https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs

Mozilla Public License 2.0

Support for dynamic quota in azurerm_cognitive_deployment #23988

Open rmoesbergen opened 9 months ago

rmoesbergen commented 9 months ago

Is there an existing issue for this?

[X] I have searched the existing issues

Community Note

Please vote on this issue by adding a :thumbsup: reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment and review the contribution guide to help.

Description

An Azure Cognitive Services deployment now supports dynamic scaling of quota when capacity is available in the account. Please add this setting to the azurerm_cognitive_deployment Terraform resource so it can be auto-provisioned. The setting is called "Dynamic Quota" in the UI.

New or Affected Resource(s)/Data Source(s)

azurerm_cognitive_deployment

Potential Terraform Configuration

resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_cognitive_account" "example" {
  name                = "example-ca"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  kind                = "OpenAI"
  sku_name            = "S0"
}

resource "azurerm_cognitive_deployment" "example" {
  name                 = "example-cd"
  cognitive_account_id = azurerm_cognitive_account.example.id
  model {
    format  = "OpenAI"
    name    = "text-curie-001"
    version = "1"
  }

  scale {
    type     = "Standard"
    capacity = 10   # K Transactions per minute
    dynamic  = true # <---- this would be the new setting
  }
}

References

https://microsoftlearning.github.io/mslearn-openai/Instructions/Labs/01-get-started-azure-openai.html#deploy-a-model

rcskosir commented 9 months ago

Thank you for taking the time to open this feature request!

unique-dominik commented 8 months ago

Good request 🍻

I observe that if you toggle this from the portal it sends:

{"displayName":"gpt-35-turbo-16k","sku":{"name":"Standard","capacity":240},"properties":{"model":{"format":"OpenAI","version":"0613","name":"gpt-35-turbo-16k"},"versionUpgradeOption":"NoAutoUpgrade","dynamicThrottlingEnabled":true,"raiPolicyName":"Microsoft.Nil"}}

and when toggled back off:

{"displayName":"gpt-35-turbo-16k","sku":{"name":"Standard","capacity":240},"properties":{"model":{"format":"OpenAI","version":"0613","name":"gpt-35-turbo-16k"},"versionUpgradeOption":"NoAutoUpgrade","dynamicThrottlingEnabled":false,"raiPolicyName":"Microsoft.Nil"}}

Notice that the dynamicThrottlingEnabled flips.

Interestingly, on the cognitive_account there is a dynamic_throttling_enabled.

I'll test later whether these are equivalent 👀 I suspect not, since accounts/create#dynamicThrottlingEnabled has its own dynamicThrottlingEnabled, separate from deployments/create#dynamicThrottlingEnabled.

If we are lucky, they get inherited 🤣

illgitthat commented 8 months ago

> Interestingly, on the cognitive_account there is a dynamic_throttling_enabled.
>
> I'll test later whether these are equivalent 👀 I suspect not, since accounts/create#dynamicThrottlingEnabled has its own dynamicThrottlingEnabled, separate from deployments/create#dynamicThrottlingEnabled.
>
> If we are lucky, they get inherited 🤣

As you mentioned, the dynamic_throttling_enabled setting at the cognitive account level is different from the one on the actual model deployments.

I tried it and it gives: DynamicThrottlingNotSupported: Thank you for your interest in Dynamic Throttling for Cognitive Services. This feature is currently not supported for the resource kind OpenAI and sku S0.
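For readers following along, the account-level attempt described above presumably looked something like the sketch below. The resource names and location are placeholders borrowed from the example in the issue description, not the configuration that was actually run; dynamic_throttling_enabled is the existing azurerm_cognitive_account argument mentioned earlier in the thread.

```hcl
# Hypothetical reproduction of the account-level attempt.
# Names and location are placeholders, not taken from the actual test.
resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_cognitive_account" "example" {
  name                = "example-ca"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  kind                = "OpenAI"
  sku_name            = "S0"

  # Account-level flag, distinct from the per-deployment dynamicThrottlingEnabled.
  # According to the error quoted above, the service rejects this for kind
  # OpenAI with sku S0 (DynamicThrottlingNotSupported).
  dynamic_throttling_enabled = true
}
```

So the account-level flag does not appear to be a route to Dynamic Quota; the setting has to be applied on the deployment itself.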

Thanks for opening this issue. I would love to know whether there is any planned update for this; otherwise I will look further into implementing it via the azapi Terraform provider.

JorisAndrade commented 6 months ago

Any news on this? According to https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/dynamic-quota the property is indeed dynamicThrottlingEnabled:

az rest --method patch --url "https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.CognitiveServices/accounts/{accountName}/deployments/{deploymentName}?api-version=2023-10-01-preview" --body '{"properties": {"dynamicThrottlingEnabled": true} }'
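The same patch can probably be expressed in Terraform with the azapi provider's azapi_update_resource, which updates selected properties on a resource that something else manages. The following is an untested sketch under a few assumptions: the deployment is the azurerm_cognitive_deployment.example resource from the issue description, the resource label dynamic_quota is made up for illustration, and body is passed through jsonencode as azapi 1.x expects (azapi 2.x takes a plain HCL object instead).

```hcl
terraform {
  required_providers {
    azapi = {
      source = "Azure/azapi"
    }
  }
}

# Assumption: azurerm_cognitive_deployment.example is defined elsewhere,
# as in the example configuration in the issue description.
resource "azapi_update_resource" "dynamic_quota" {
  type        = "Microsoft.CognitiveServices/accounts/deployments@2023-10-01-preview"
  resource_id = azurerm_cognitive_deployment.example.id

  # Only the property being changed is listed here.
  body = jsonencode({
    properties = {
      dynamicThrottlingEnabled = true
    }
  })
}
```

Whether a later apply of the azurerm_cognitive_deployment resource resets the flag has not been verified here, so treat this as a stopgap until the provider supports the setting natively.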

illgitthat commented 6 months ago

@JorisAndrade you can accomplish this via azapi provider in the meantime. Hope this helps!

resource "azapi_resource" "model_deployment" {
  type                      = "Microsoft.CognitiveServices/accounts/deployments@2023-10-01-preview"
  schema_validation_enabled = false
  parent_id                 = azurerm_cognitive_account.account.id
  name                      = "gpt-4"
  body = jsonencode({
    sku = {
      name     = "Standard" # SKU name as in the portal payload above, not the model name
      capacity = 1
    }
    properties = {
      model = {
        format  = "OpenAI"
        name    = "gpt-4"
        version = "1106-preview"
      }
      dynamicThrottlingEnabled = true
      versionUpgradeOption     = "OnceNewDefaultVersionAvailable" # Options: NoAutoUpgrade, OnceCurrentVersionExpired, OnceNewDefaultVersionAvailable
    }
  })
  depends_on = [azurerm_cognitive_account.account]
}

VickyWinner commented 6 months ago

@illgitthat thanks for sharing, I will try this out. @rcskosir is there any ETA for this enhancement to be available as part of the provider?

rcskosir commented 6 months ago

Thanks for reaching out. Unfortunately, I do not have an ETA for this enhancement. Any future work by the team or the community should end up linked here via a PR.

guilhem commented 5 months ago

I opened a PR on pandora to add https://github.com/Azure/azure-rest-api-specs/tree/main/specification/cognitiveservices/resource-manager/Microsoft.CognitiveServices/preview/2023-10-01-preview to https://github.com/hashicorp/go-azure-sdk

https://github.com/Azure/azure-rest-api-specs/blob/82f3d9571517966992eaf97b1db73f0a821cd06b/specification/cognitiveservices/resource-manager/Microsoft.CognitiveServices/preview/2023-10-01-preview/cognitiveservices.json#L3303

Once that is done, we will be able to import it into the provider and add this feature.