Azure / AI-in-a-Box

MIT License

[BUG] Issue with bot-in-a-box Gen-AI/Assistant accelerator #98

Open PatLac04 opened 2 months ago

PatLac04 commented 2 months ago


Describe the bug

I tried deploying the solution multiple times, following the instructions, but the bot doesn't work. I always get the same error message, whatever prompt I enter.

[Screenshot: bot-error]

To Reproduce

Steps to reproduce the behavior:

  1. Deploy the solution according to the instructions in the repo. Make sure to pick the Canada East region, since that is the only region where GPT-4 is available.
  2. Go to the Bot service and try "Test WebChat"
  3. Enter something in the prompt
  4. See error

Expected behavior

A valid answer.

MarcoABCardoso commented 2 months ago

Hey @PatLac04! Thank you for filing an issue.

Would you kindly let us know whether the assistant was created in your Azure OpenAI instance? There is a post-deploy hook that is supposed to create it and set it in your application.

Thank you!

MarcoABCardoso commented 2 months ago

Hello again @PatLac04!

I believe I understand the issue: Assistants are not yet available in Canada East. This causes the Assistant creation to fail, which in turn leaves a null value in the ASSISTANT_ID configuration variable.

As of today, the available regions are East US 2, Sweden Central and Australia East.

I'll use this issue to track the need to call out this deployment error when Assistants are not available in the selected region.

Thanks again for bringing this to our attention!
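
The failure mode described above (unsupported region, then an empty ASSISTANT_ID) could be called out with a small pre-flight check. The following is only a sketch, not the repository's actual post-deploy hook: the function names are hypothetical, the region set is copied from this comment and will go stale, and treating the literal string "null" as missing is an assumption about how the setting ends up stored.

```python
# Regions named in this thread as supporting Assistants at the time of writing.
# Assumption: this list is illustrative only; check the current Azure docs.
ASSISTANTS_REGIONS = {"eastus2", "swedencentral", "australiaeast"}


def normalize_region(name: str) -> str:
    """Turn 'Canada East' or 'canada-east' into 'canadaeast'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def is_missing(value) -> bool:
    """True for None, empty strings, and the literal text 'null' (assumed)."""
    return value is None or value.strip().lower() in {"", "null"}


def check_deployment(region: str, settings: dict) -> list:
    """Return human-readable problems; an empty list means no issue found."""
    problems = []
    if normalize_region(region) not in ASSISTANTS_REGIONS:
        problems.append(
            f"Assistants may not be available in '{region}'; "
            f"known regions: {sorted(ASSISTANTS_REGIONS)}"
        )
    if is_missing(settings.get("ASSISTANT_ID")):
        problems.append("ASSISTANT_ID is not set; assistant creation likely failed")
    return problems


if __name__ == "__main__":
    for problem in check_deployment("Canada East", {"ASSISTANT_ID": "null"}):
        print("WARNING:", problem)
```

Running a check like this before or right after deployment would surface both symptoms from this report with an explicit message instead of a generic bot error.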

PatLac04 commented 2 months ago

Hi @MarcoABCardoso,

That's exactly what I wanted to tell you. The deployment forces us to use GPT-4, which is only available in a few regions. I used Canada East. I'll try Sweden Central, since both GPT-4 and Assistants are available there.

thomassantosh commented 2 months ago

Just an observation on this thread: once you establish the regions where the Assistants API is available, any existing deployments in those regions may "eat" into your available PAYGO capacity and cause the provisioning process to fail. So you need to scale existing deployments down (both gpt-4 and text-embedding) to leave enough buffer for the provisioning process to complete.

MarcoABCardoso commented 2 months ago

Great points, @thomassantosh. We may want to add the capacity as an input; right now it's hardcoded as 10. I've started a PR on these issues and will add to it accordingly.
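
One way to picture "capacity as an input" is the sketch below. It is not the change made in the PR: the MODEL_CAPACITY setting name and both helper functions are hypothetical, and the quota figures in the example are made up purely to show the arithmetic. The default of 10 is the hardcoded value mentioned above.

```python
DEFAULT_CAPACITY = 10  # the value this thread says is currently hardcoded


def resolve_capacity(settings: dict, default: int = DEFAULT_CAPACITY) -> int:
    """Read a capacity override from settings, falling back to the default.

    MODEL_CAPACITY is a hypothetical setting name used for illustration.
    """
    raw = settings.get("MODEL_CAPACITY")
    if raw is None or raw.strip() == "":
        return default
    capacity = int(raw)  # raises ValueError on non-numeric input
    if capacity <= 0:
        raise ValueError("capacity must be a positive integer")
    return capacity


def remaining_after(requested: int, quota_limit: int, in_use: int) -> int:
    """Quota left after the new deployment; negative means it will not fit."""
    return quota_limit - in_use - requested


if __name__ == "__main__":
    requested = resolve_capacity({"MODEL_CAPACITY": "5"})
    # Made-up numbers: a limit of 40 with 35 already used by other deployments.
    left = remaining_after(requested, quota_limit=40, in_use=35)
    print("fits" if left >= 0 else "scale down existing deployments first")
```

With the default of 10 and the same made-up quota, the result would be negative, which is the situation described above where existing deployments leave too little room for provisioning.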