Open brucedkyle opened 3 days ago
Thanks for reporting this!
We recently changed the model from gpt-4 to gpt-4o-mini, so the restriction probably applies only to the new model.
I'll ask around to understand which models are restricted with a Trial subscription and update the docs. Meanwhile, you can try a different model; even an "older" gpt-35-turbo works well for this workshop.
EDIT: I just found out that the 1TPM restriction applies to all models when using the free trial, which means that the best way to fix the issue for now is to update the capacity to 1 here: https://github.com/Azure-Samples/azure-openai-rag-workshop/blob/main/infra/main.bicep#L51
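For illustration, a deployment with its capacity set to 1 might look roughly like the Bicep below. This is only a minimal sketch, not the workshop's actual main.bicep: the parameter names (openAiAccountName, chatDeploymentCapacity), the resource symbols, the API version, and the model version are all assumptions, and the real file may pass the capacity through a module instead.

```bicep
// Hypothetical sketch: names and versions below are assumptions,
// not copied from the workshop's infra/main.bicep.

// Name of an existing Azure OpenAI account (assumed parameter).
param openAiAccountName string

// Deployment capacity; 1 is the value suggested for free trial subscriptions.
param chatDeploymentCapacity int = 1

// Reference the existing Azure OpenAI account.
resource openAi 'Microsoft.CognitiveServices/accounts@2023-05-01' existing = {
  name: openAiAccountName
}

// Chat model deployment using the Standard (pay per API usage) SKU.
resource chatDeployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = {
  parent: openAi
  name: 'gpt-4o-mini'
  sku: {
    name: 'Standard'
    capacity: chatDeploymentCapacity
  }
  properties: {
    model: {
      format: 'OpenAI'
      name: 'gpt-4o-mini'
      version: '2024-07-18'
    }
  }
}
```

After changing the value, running azd provision again should apply the lower capacity, assuming nothing else in the template overrides it.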
I'll check if we can have a better workaround and update the docs.
Many thanks. I was reading through all the docs last night down one rabbit hole after another trying to figure out how to set that value and how to reset my Bicep. This is valuable to know when it is time for production too.
So your suggestion is PERFECT. To do the dev/test in the lab, this should work perfectly.
I'll try it in a few hours.
My thanks.
I may be getting ahead of myself, but should I be using the "ProvisionedManaged" SKU to also help manage my quota?
a. Not sure if that fixes the problem for the trial. b. Should that be my choice even when I go into the paid subscriptions?
If that's discussed in the lab, I'll be happy to learn more.
I don't recommend using the ProvisionedManaged SKU (either for paid or trial) unless you have a consistently high-throughput workload, as you get billed by the hour instead of per API usage as with the Standard/GlobalStandard SKU.
The best workaround would be to use GitHub Models if you have access, as you get 8K TPM and a simpler setup (you can use a setup similar to the Ollama one if you're using the Qdrant variant of the workshop). We plan to add this as an alternative in the docs; it should be up in the coming weeks.
I will check out GitHub Models and the Qdrant variant of the workshop. Both great suggestions.
I just have to say I so appreciate your support. And I hope my suggestions are helpful.
This is one of the finest labs I've worked on. Great details and excellent starting code from what I can tell.
This topic is huge. And I am being asked about RAG in every AI/ML interview. Being up on Azure is HUGE to me. And this is a great starting point.
Thank you for your help. I am up with the 1TPM.
In "docs/sections/09-azure.md", the text suggests that a trial or free version of Azure is sufficient to deploy this workshop. However, it does not seem to deploy correctly. When I reached that point later in the workshop, azd provision failed:
The support tool does not allow changes to quota. When you follow the support instructions to request an update, the link takes you to a form. The form to request change in quota is "closed". I tried it for US East 2. Not sure of other locations/configurations, etc.
Neither the support text nor Copilot in that location has suggestions on how to solve these issues.