Azure-Samples / azure-search-openai-demo

A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
https://azure.microsoft.com/products/search
MIT License

Error deploying open AI: Specified capacity of account deployment is bigger than available capacity for UsageName Tokens Per Minute (Thousands) ext-Davinci-003 #313

Closed · Codearella closed this issue 1 year ago

Codearella commented 1 year ago

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

I ran azd up after initializing the project; however, the OpenAI deployment failed with the error message described below. The other components deployed without errors. I'm not sure exactly what's causing the error, so I'm not sure how to reproduce it.

Any log messages given by the failure

The specified capacity '60' of account deployment is bigger than available capacity '0' for UsageName 'Tokens Per Minute (thousands) - Text-Davinci-003'.

Expected/desired behavior

Expected behavior is for openai to be deployed without errors.

OS and Version?

Windows 10 Enterprise (OS Build 19044.2965)

azd version?

azd version 1.0.1 (commit e0cd1aca716fa5d08704beade7dcc734fe68f5f1)

Mention any other details that might be useful

pamelafox commented 1 year ago

I ran into this as well. It turns out that Azure OpenAI changed their quota system last week, and hasn't yet updated the Bicep schema (the infrastructure-as-code language used for this repo and others) to reflect the new quota system. For now, I've been manually resetting the quota of my deployments before each "azd up" by going to the Azure OpenAI studio, selecting the Quotas tab, and sliding the TPM to 1K for each of them.

paprocki-r commented 1 year ago

same here https://github.com/Azure-Samples/azure-search-openai-demo/issues/307

fakoe commented 1 year ago

I'm completely new to Azure.

How do you manually reset the quota of your deployment when you haven't deployed it yet? In my Azure OpenAI Studio, there is no deployment visible, either before or after deploying this demo with "azd up".

I only see the deployment in portal.azure.com.

As I already mentioned, I'm new to Azure / Azure OpenAI and I really don't understand where to look. I can set an environment name for the deployment, which creates a resource group, but that's all I understand. I can also see the quota TPMs per region and subscription, which currently sits at 120/120. Since I have no deployment running in East US and it still sits at 120/120, I wonder how I can determine which deployment currently takes up all the quota. For West Europe I can see the deployments taking up the TPMs, but for East US I can't. Could it be that my account doesn't have sufficient permissions? My company created it for tests.

paprocki-r commented 1 year ago

> I'm completely new to Azure.
>
> How do you manually reset the quota of your deployment when you haven't deployed it yet? In my Azure OpenAI Studio, there is no deployment visible, either before or after deploying this demo with "azd up".
>
> I only see the deployment in portal.azure.com.
>
> As I already mentioned, I'm new to Azure / Azure OpenAI and I really don't understand where to look. I can set an environment name for the deployment, which creates a resource group, but that's all I understand. I can also see the quota TPMs per region and subscription, which currently sits at 120/120. Since I have no deployment running in East US and it still sits at 120/120, I wonder how I can determine which deployment currently takes up all the quota. For West Europe I can see the deployments taking up the TPMs, but for East US I can't. Could it be that my account doesn't have sufficient permissions? My company created it for tests.

@fakoe, a temporary solution is to go to Azure AI Studio > Deployments, and for each deployment choose Edit deployment > Advanced and set "Tokens per Minute Rate Limit" from 120 to e.g. 2. Then deploy your accelerator with "azd up". Also, open the project in VS Code; it will suggest some useful extensions.

fakoe commented 1 year ago

> @fakoe, a temporary solution is to go to Azure AI Studio > Deployments, and for each deployment choose Edit deployment > Advanced and set "Tokens per Minute Rate Limit" from 120 to e.g. 2. Then deploy your accelerator with "azd up". Also, open the project in VS Code; it will suggest some useful extensions.

I understand where to set it, but the problem is that I don't see the deployment. I followed all the installation steps in this demo and that's it; they don't tell you how to set up a deployment that you can see in Azure AI Studio. I got an account from my company, and they told me it has a ChatGPT subscription. I deployed a demo without problems last week. Now I wanted to create another one, and I run into the described error.

If I look up the currently available deployments in Azure AI Studio, I only see a deployment that's not mine and that I never referenced or used. The demo mentions using existing resources, but I never did that; I just ran azd up, selected the subscription and the correct region, and that's it. How can I configure or edit a deployment if I haven't created it with azd up? Do I have to create a new deployment for, say, text-davinci-003 (which seems to be the default) and explicitly reference it through AZURE_OPENAI_CHATGPT_DEPLOYMENT or AZURE_OPENAI_GPT_DEPLOYMENT with azd up? That makes me wonder how the demo has worked without that so far (until last week).

To my understanding, all I can track from this demo when I run azd up is the general deployment of all resources. Only the OpenAI resource fails there, with the error described above. But when I want to check this resource, it says it's not even created; it just gives me that error message and that's it. I don't know how I should give that particular resource TPM when it's not even created due to insufficient TPM :/

pamelafox commented 1 year ago

It looks like https://github.com/Azure-Samples/azure-search-openai-demo/pull/322 might address this, I'll be testing it out shortly.

pablocastro commented 1 year ago

Just merged #322 which addresses this. Closing this, but feel free to reopen if some scenario comes up where the change is not effective.

OrionSehn commented 1 year ago

This problem persists.

I deployed this about 24 hours ago and it was working with your previous deployment files, but now for some reason it won't work, and I still get the same error with your latest changes.

This is what I ran into this morning: (screenshot attached)

I pulled your latest build, and this was the error in the build process (screenshot attached). Very similar error.

I'm deploying to US East, but I'm closest to US West; I don't know if that matters.

I would also like to see a Bicep deployment of this at the target scope of a resource group. Often, larger companies don't actually deploy things at the subscription level and reserve those permissions for administration. It would make a lot more sense for these to target a specific resource group (although I'm aware that support for the Azure Developer CLI at resource group scope is in alpha).

Targeting a subscription-level deployment limits swift implementation of this example to individuals, small startups, and very high-level system administrators at larger companies.

@pablocastro

edit: typo

benleroux commented 1 year ago

The Azure OpenAI service is "soft deleted" when you delete it; if you have one or a few you've deleted, you may need to purge them to release quota.

Bv-Lucas commented 1 year ago

I'm facing the same issue with the Terraform azurerm provider.

(screenshot attached)

The latest provider update was 8 days ago, and its docs don't reference the new "capacity" attribute that #322 added on the Bicep side.

Any chance of an update to the azurerm provider for people using Terraform?

gregorcvek commented 1 year ago

> The Azure OpenAI service is "soft deleted" when you delete it; if you have one or a few you've deleted, you may need to purge them to release quota.

Purge solved it for me, thanks.

OrionSehn commented 1 year ago

> The Azure OpenAI service is "soft deleted" when you delete it; if you have one or a few you've deleted, you may need to purge them to release quota.

Thank you very much, this resolved my issue as well. It's annoying that the error message was exactly the same as the one for the issue resolved in this thread.

robvet commented 1 year ago

Thanks for the tip. That worked!

gregorcvek commented 1 year ago

Hi,

Thank you for your response. Purge worked.

kr

wiswis15 commented 1 year ago

The issue is still here, but purging fixes it. For how to purge, see: https://learn.microsoft.com/en-us/azure/cognitive-services/manage-resources?tabs=azure-portal
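For anyone who prefers the command line over the portal, the linked doc also covers an Azure CLI path. A minimal sketch, assuming the Azure CLI is installed and logged in (the location, resource group, and account name below are placeholders):

```shell
# List Cognitive Services / Azure OpenAI accounts that are currently soft-deleted
az cognitiveservices account list-deleted --output table

# Purge a soft-deleted account so its quota is released
# (placeholders: eastus, my-rg, my-openai-account)
az cognitiveservices account purge \
  --location eastus \
  --resource-group my-rg \
  --name my-openai-account
```

After purging, re-running azd up should no longer be blocked by quota held by the old, soft-deleted deployments.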

fakoe commented 1 year ago

> The issue is still here, but purging fixes it. For how to purge, see: https://learn.microsoft.com/en-us/azure/cognitive-services/manage-resources?tabs=azure-portal

THIS is what I was searching for! If you stop the deployment while it's running, or if you have a successful deployment and then re-deploy and something goes wrong, I couldn't find the previously built resources, and "azd down --purge" was only able to capture the current deployment, not the previous ones. As a result, the old deployments blocked the available quota even though they weren't active.

To me, the problem with the quotas doesn't seem to be a bug in the demo, but a general annoyance with how Azure and Cognitive Services keep quota allocated to soft-deleted deployments.

It's my first time using Azure and Cognitive Services, and I just didn't have the knowledge to understand what was happening.

Thanks for the clarification!

Bv-Lucas commented 1 year ago

Purging doesn't fix the issue at all. It frees some quota if you have soft-deleted resources, which in turn makes deployment possible because you suddenly have quota available.

This issue is about being able to define a "capacity" when using Bicep, which was implemented a few days ago and allows Bicep users to deploy the OpenAI service with a specific capacity instead of relying on the default Microsoft policy for quotas. It has very little to do with soft-deleted OpenAI services.

wiswis15 commented 1 year ago

Well, for me, using the fix was not enough.

fakoe commented 1 year ago

> Purging doesn't fix the issue at all. It frees some quota if you have soft-deleted resources, which in turn makes deployment possible because you suddenly have quota available.
>
> This issue is about being able to define a "capacity" when using Bicep, which was implemented a few days ago and allows Bicep users to deploy the OpenAI service with a specific capacity instead of relying on the default Microsoft policy for quotas. It has very little to do with soft-deleted OpenAI services.

If soft-deleted resources take up the maximum amount of quota, you still won't be able to re-deploy the demo, even with a definable TPM.

Branislava commented 1 year ago

Reducing quota actually helped, thanks a lot.

wiswis15 commented 1 year ago

@Branislava can you please explain how to reduce the quota?

wiswis15 commented 1 year ago

I'm having this same error:

Deployment Error Details: InvalidTemplateDeployment: The template deployment 'openai' is not valid according to the validation procedure. The tracking id is '14df4aaf-343b-402a-a57f-ff468cfffea1'. See inner errors for details. InsufficientQuota: The specified capacity '1' of account deployment is bigger than available capacity '0' for UsageName 'Tokens Per Minute (thousands) - Text-Davinci-003'.

There are no OpenAI services to purge!

yeggan commented 1 year ago

@pablocastro @pamelafox It seems #322 doesn't resolve the issue. I ran it a few minutes ago and received the error below.

ERROR: deployment failed: failing invoking action 'provision', error deploying infrastructure: deploying to subscription:

Deployment Error Details: InvalidTemplateDeployment: The template deployment 'openai' is not valid according to the validation procedure. The tracking id is .......'. See inner errors for details. InsufficientQuota: The specified capacity '1' of account deployment is bigger than available capacity '0' for UsageName 'Tokens Per Minute (thousands) - Text-Davinci-003'.

Sangeeth-fb commented 1 year ago

azd up updates the capacity to 120k, so regardless of updating the deployment quota to 1k prior to azd up, deployment fails with the error: InsufficientQuota: The specified capacity '120' of account deployment is bigger than available capacity '99' for UsageName 'Tokens Per Minute (thousands) - GPT-35-Turbo'.

TroyHostetter commented 1 year ago

Yeah, I am getting the same issue. There was nothing to purge using the Cognitive Services Anomaly Detector.

I have been having issues since the suite of deployment models was greatly reduced within the past week or so. I've not been successful at implementing any of the 3 new ones: GPT-35-Turbo, GPT-35-Turbo-16k, and Text-Embedding-Ada-002.

{
  "code": "InvalidTemplateDeployment",
  "message": "The template deployment 'openai' is not valid according to the validation procedure. The tracking id is '3d1c4208-e1f1-4c96-98cd-8722a488465d'. See inner errors for details.",
  "details": [
    {
      "code": "InsufficientQuota",
      "message": "The specified capacity '120' of account deployment is bigger than available capacity '118' for UsageName 'Tokens Per Minute (thousands) - GPT-35-Turbo'."
    }
  ]
}

pamelafox commented 1 year ago

@TroyHostetter Did you update your Bicep to change the deployment capacity to 30? That message indicates it's still the default of 120, I think. Here's what it looks like now:

https://github.com/Azure-Samples/azure-search-openai-demo/blob/0cc155b2c8ce788be7b0916f392d8f531a71f6d9/infra/main.bicep#L145

You could diff your main.bicep with the current main.bicep using a tool like diffchecker.com
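For context, a simplified sketch of what an explicit capacity on a model deployment looks like in Bicep after #322. This is not the repo's exact module: only the `chatGptDeploymentCapacity` parameter name comes from the thread above, and the account name, deployment name, model name/version, and API version below are illustrative assumptions.

```bicep
// Sketch: Azure OpenAI model deployment with an explicit capacity.
// The capacity is expressed in thousands of tokens per minute (TPM).
param chatGptDeploymentCapacity int = 30

// Hypothetical existing Azure OpenAI (Cognitive Services) account.
resource account 'Microsoft.CognitiveServices/accounts@2023-05-01' existing = {
  name: 'my-openai-account'
}

resource chatGptDeployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = {
  parent: account
  name: 'chat'
  properties: {
    model: {
      format: 'OpenAI'
      name: 'gpt-35-turbo'
      version: '0613'
    }
  }
  sku: {
    name: 'Standard'
    capacity: chatGptDeploymentCapacity // must fit within the region's remaining TPM quota
  }
}
```

Because the capacity is in thousands of TPM, the number you pick has to fit within whatever quota the target region and subscription still have free, which is why a default of 120 fails when other (including soft-deleted) deployments are already holding quota.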

OrionSehn commented 1 year ago

I can confirm that the update @pamelafox mentioned did resolve that particular issue, and it does deploy correctly for me.

Somewhat unrelated, but is there any chance we'll get a deployment pipeline at a resource-group target scope instead of the subscription level? Many large companies don't give out subscription-level permissions across the company, and a resource-group targetScope would help get this deployed faster. (I know this azd feature is in alpha.)

edit - grammar

pamelafox commented 1 year ago

@OrionSehn Please subscribe to this issue in the azd repo, as that seems to be what you're looking for: https://github.com/Azure/azure-dev/issues/337 Thanks for the feedback!

TroyHostetter commented 1 year ago

> @TroyHostetter Did you update your Bicep to change the deployment capacity to 30? That message indicates it's still the default of 120, I think. Here's what it looks like now:
>
> https://github.com/Azure-Samples/azure-search-openai-demo/blob/0cc155b2c8ce788be7b0916f392d8f531a71f6d9/infra/main.bicep#L145
>
> You could diff your main.bicep with the current main.bicep using a tool like diffchecker.com

That did the trick, with a few other "diff" changes. Thanks for the help! Last question: is there a place for general questions about the openai-demo repo?

pamelafox commented 1 year ago

We don't have a "Discussions" tab enabled here, so you can use "Issues" for questions as well.

Sangeeth-fb commented 1 year ago

@pamelafox, it works for me as well after making changes in the main and cognitiveservices Bicep files. My changes are way ahead; I will do a merge this weekend.

ananthuraj-cresen commented 1 year ago

> I ran into this as well. It turns out that Azure OpenAI changed their quota system last week, and hasn't yet updated the Bicep schema (the infrastructure-as-code language used for this repo and others) to reflect the new quota system. For now, I've been manually resetting the quota of my deployments before each "azd up" by going to the Azure OpenAI studio, selecting the Quotas tab, and sliding the TPM to 1K for each of them.

@pamelafox I have faced the same issue, and the deployment worked when the token rates were manually set to 1k on the portal side. Doesn't this increase the deployment time drastically? Is there any other way to improve the token rate, like using a number higher than 1k that would be faster and wouldn't cause issues? Is there any way of changing this parameter in the Bicep file of the repo? I tried changing param chatGptDeploymentCapacity int = 30 to 10 and other numbers in the main.bicep file, but it hasn't made any difference.

zlr-raja commented 1 year ago

(screenshot attached: MicrosoftTeams-image)

Follow this to resolve the issue.

yeggan commented 1 year ago

Thanks. I knew this; sometimes you also need to do it via the command line. The issue was not resolved by this.


pamelafox commented 1 year ago

If anyone is still experiencing this and thinks it's an error, here is a related issue: https://github.com/Azure/bicep-types-az/issues/1660

khalid-ibnelbachyr commented 1 year ago

Purging Azure OpenAI resources fixes the issue for me ;)