Azure / terraform-azurerm-openai

Terraform module for deploying Azure OpenAI Service.
MIT License

model_version 0125 is not available #94

Closed kb12abz closed 3 weeks ago

kb12abz commented 3 weeks ago

Is there an existing issue for this?

Greenfield/Brownfield provisioning

greenfield

Terraform Version

1.8.5

Module Version

0.1.3

AzureRM Provider Version

~> 3.0, < 4.0

Affected Resource(s)/Data Source(s)

openai

Terraform Configuration Files

module "openai" {
  account_name                  = var.account_name
  source                        = "Azure/openai/azurerm"
  version                       = "0.1.3"
  resource_group_name           = azurerm_resource_group.cai_resource_group.name
  location                      = var.location
  public_network_access_enabled = true
  private_endpoint = {
    "${var.account_name}-${var.environment}" = {
      name                            = "${var.account_name}-${var.environment}"
      private_service_connection_name = "${var.account_name}-${var.environment}_connection"
      subnet_name                     = "default"
      vnet_name                       = data.azurerm_virtual_network.cai-vnet.name
      vnet_rg_name                    = data.azurerm_virtual_network.cai-vnet.resource_group_name
      private_dns_entry_enabled       = true
      dns_zone_virtual_network_link   = "dns_zone_link_openai"
      is_manual_connection            = false
    }
  }
  deployment = {
    "gpt-3.5-turbo" = {
      name          = var.gtp3_5_name
      model_format  = "OpenAI"
      model_name    = "gpt-35-turbo"
      model_version = "0301"
      scale_type    = var.scale_type
      capacity      = var.gtp3_5_capacity
    },
    "gpt-4-0" = {
      name          = var.gtp4o_name
      model_format  = "OpenAI"
      model_name    = "gpt-4o"
      model_version = "2024-05-13"
      scale_type    = var.scale_type
      capacity      = var.gtp4o_capacity
    }
  }
  depends_on = [
    azurerm_resource_group.cai_resource_group
  ]
}

tfvars variables values

client_id           = ""
subscription_id     = ""
account_name        = "tfe-np-open-ai-east-us"
gtp3_5_name         = "tfe-np-gpt-3-5-turbo"
gtp4o_name          = "tfe-np-gpt-4o"
gtp3_5_capacity     = 240
gtp4o_capacity      = 120
environment         = "np"
vnet_name           = "test-np"
vnet_resource_group = "test-vnet"

Debug Output/Panic Output

I believe the issue is due to the module using an older version of azurerm:

https://github.com/Azure/terraform-azurerm-openai/blob/main/providers.tf

0125 was added in February 2024; the module was last published in January 2024.

Expected Behaviour

Expect to be able to use 0125 model_version

Actual Behaviour

Terraform reports an error that model version 0125 is not available. Version 0301, as in the configuration above, deploys correctly.
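For illustration, the failing variant presumably differs from the configuration above only in the model_version of the gpt-35-turbo deployment. A sketch of that entry, assuming every other argument is unchanged:

```hcl
# Fragment of the module "openai" block above; only model_version differs.
deployment = {
  "gpt-3.5-turbo" = {
    name          = var.gtp3_5_name
    model_format  = "OpenAI"
    model_name    = "gpt-35-turbo"
    model_version = "0125" # assumed failing value; "0301" deploys
    scale_type    = var.scale_type
    capacity      = var.gtp3_5_capacity
  }
}
```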

Steps to Reproduce

terraform apply

Important Factoids

No response

References

No response

zioproto commented 3 weeks ago

Hello @kb12abz,

I understand you want to use GPT-4 Turbo. For that, you should use the model version turbo-2024-04-09.

Please look at this documentation: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#gpt-4-turbo
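As a rough sketch, a deployment entry for that version might look like the following. The map key, var.gpt4_turbo_name, and var.gpt4_turbo_capacity are hypothetical names that do not appear in the original configuration; gpt-4 is the model name under which the linked documentation lists GPT-4 Turbo.

```hcl
# Fragment for the module "openai" block; names marked hypothetical are
# illustrative only and are not part of the reporter's configuration.
deployment = {
  "gpt-4-turbo" = {                        # hypothetical map key
    name          = var.gpt4_turbo_name    # hypothetical variable
    model_format  = "OpenAI"
    model_name    = "gpt-4"
    model_version = "turbo-2024-04-09"
    scale_type    = var.scale_type
    capacity      = var.gpt4_turbo_capacity # hypothetical variable
  }
}
```

Whether this version can be deployed still depends on the region chosen for the account.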

kb12abz commented 3 weeks ago

@zioproto it's gpt-3.5-turbo that we would like to use 0125 for initially. I will look at the above for GPT-4 Turbo. Thanks.

zioproto commented 3 weeks ago

@kb12abz I understand now which model you are looking for: [screenshot, 2024-08-23]

Let me try to reproduce

kb12abz commented 3 weeks ago

Thanks, and apologies for not making that clearer.

zioproto commented 3 weeks ago

@kb12abz in this table you can check in which regions the model version you are looking for is available: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#standard-deployment-model-availability

Can you confirm that you are using one of those regions? [screenshot, 2024-08-23]

kb12abz commented 3 weeks ago

Ah, it looks like this is where our issue comes from. We are deploying into eastus, which is why it won't work. Thanks for your help.

zioproto commented 3 weeks ago

If you deploy to eastus your terraform apply should fail with this error:

│ Deployment Name: "gpt-35-turbo"): performing CreateOrUpdate: unexpected status 400 (400 Bad Request) with error:
│ InvalidResourceProperties: The specified SKU 'Standard' for model 'gpt-35-turbo 0125' is not supported in this region 'eastus'.
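
One way to surface this earlier, at plan time rather than as a 400 during apply, is a validation rule on the location variable. This is a minimal sketch, assuming the root module declares var.location itself; it encodes only the one region this thread confirms as unsupported and is not a full availability check.

```hcl
variable "location" {
  type        = string
  description = "Azure region for the Azure OpenAI account."

  validation {
    # Normalize "East US" and "eastus" to the same form before comparing.
    condition     = replace(lower(var.location), " ", "") != "eastus"
    error_message = "The Standard SKU for gpt-35-turbo 0125 is not supported in eastus. Choose a region from the model availability table."
  }
}
```

With this in place, terraform plan fails with the message above when location is set to eastus. Other unsupported regions would still have to be checked against the availability table.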