Azure-Samples / aks-store-demo

Sample microservices app for AKS demos, tutorials, and experiments
MIT License

[BUG] Inaccessible Open AI Model #116

Closed nathaniel-msft closed 5 months ago

nathaniel-msft commented 5 months ago

Describe the bug

Not all OpenAI models are available to every subscription. Some subscriptions can't use the gpt-35-turbo model, and the deployment then leaves the cluster hung. Instead of leaving the cluster stuck in a hung state, the deployment should still complete, either without the OpenAI service or by trying a different model.

To Reproduce

Steps to reproduce the behavior:

  1. Run azd up
  2. Select a subscription without access to gpt-35-turbo
  3. Check Azure Portal for AKS Cluster in Hung/Failed state.
  4. See error message

This operation requires 30 new capacity in quota Tokens Per Minute (thousands) - GPT-35-Turbo, which is bigger than the current available capacity 0. The current quota usage is 300 and the quota limit is 300 for quota Tokens Per Minute (thousands) - GPT-35-Turbo. (Code: InsufficientQuota)

Expected behavior

  1. Runs and doesn't hang.

pauldotyu commented 5 months ago

Thanks for raising this @nathaniel-msft.

This seems to be subscription related and not due to any bug in the infra-as-code templates. Users should ensure they have enough quota in the desired region before running the azd up command.

This Azure CLI command can be used to check tokens per minute quota for a particular region:

az cognitiveservices usage list \
  --location eastus2 \
  --query "[].{name: name.value, currentValue:currentValue, limit: limit}" \
  -o table

We currently default the model capacity to 30, in units of thousands of tokens per minute (see variables.tf), so the difference between Limit and CurrentValue in the CLI query above should be at least 30.
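The headroom check described above can be sketched as a small shell helper. This is only an illustration, not part of the repo's templates: the function name `has_quota_headroom` is hypothetical, and the quota name used in the commented wiring (`OpenAI.Standard.gpt-35-turbo`) is an assumption that should be confirmed against the table output of the query above.

```shell
# Succeeds (exit status 0) when the unused quota covers the requested capacity.
# Arguments: current usage, quota limit, required capacity, all in thousands of TPM.
# Hypothetical helper for illustration only.
has_quota_headroom() {
  # ${N%.*} drops a fractional part such as ".0" so integer arithmetic works.
  [ $(( ${2%.*} - ${1%.*} )) -ge "$3" ]
}

# Possible wiring to the Azure CLI. The quota name is an assumption; check the
# table output of the usage query for the exact name in your region.
#   usage=$(az cognitiveservices usage list --location eastus2 \
#     --query "[?name.value=='OpenAI.Standard.gpt-35-turbo'].[currentValue, limit]" \
#     -o tsv)
#   has_quota_headroom $usage 30 || echo "Not enough quota for azd up" >&2
```

With the numbers from the error message in this issue (usage 300, limit 300, required 30), the available headroom is 0, so the check fails. That matches the InsufficientQuota result.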

Can you confirm the usage on your end?

nathaniel-msft commented 5 months ago

Sounds good, I'll close this issue. There were concerns about defaulting to gpt-35-turbo since it's not available on all subscriptions, but there's currently no better alternative, as the -16k variant is an upgraded version.