punitkshah opened 5 months ago
Thanks for your suggestion, @punitkshah! For my understanding, would a customer with provisioned deployments (PTU) not deploy this via infra as code already? My belief would be that a customer would adapt this repository to their needs, and add their current deployment in the infra code, instead of deploying both separately.
Once they have added the PTU deployment to this repository's infra as code, an already-existing OpenAI model deployment won't be recreated.
@iMicknl - I was considering an approach that parameterizes the endpoint and model name. However, to incorporate existing resources, users would still need to modify the repository itself, updating details such as the model name and capacity in the main.bicep and ai.bicep files.
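To illustrate, the parameterization could look something like this minimal sketch. The parameter names here (`useExistingOpenAi`, `existingOpenAiEndpoint`, `openAiModelName`) are hypothetical and not taken from the repository:

```bicep
// Hypothetical parameters for main.bicep -- names are illustrative only.
@description('Reuse an existing Azure OpenAI resource instead of creating one.')
param useExistingOpenAi bool = false

@description('Endpoint of the existing Azure OpenAI resource (used only when useExistingOpenAi is true).')
param existingOpenAiEndpoint string = ''

@description('Name of the (possibly PTU-backed) model deployment to route requests to.')
param openAiModelName string = 'gpt-4'
```

These values would then be threaded down into ai.bicep so the proxy can target either a newly created or a pre-existing deployment.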
For most PTU deployments, which I expect to be the common scenario where this proxy is used, the models will already have been deployed with the required capacity.
While not a significant obstacle, recent deployments suggest that users find it more convenient to specify values in a single parameter file rather than updating multiple Bicep files.
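A single parameter file along these lines would cover that. This is a hedged sketch using the `.bicepparam` format; the file name and parameter names are assumptions, not the repository's actual conventions:

```bicep
// Hypothetical main.bicepparam -- one place for users to set their values.
using './main.bicep'

param useExistingOpenAi = true
param existingOpenAiEndpoint = 'https://my-ptu-account.openai.azure.com/'
param openAiModelName = 'gpt-4'
```

With this, a user with an existing PTU deployment never has to open main.bicep or ai.bicep at all.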
When a customer has purchased provisioned throughput units (PTUs) for their Azure OpenAI instance, the deployed models are tied to those PTUs. In that case, the script should offer the option to use the existing Azure OpenAI endpoint and model name rather than attempting to create these resources.
This can be accomplished by parameterizing the endpoint values and adding a conditional flag that controls whether the Azure OpenAI resources and models are deployed.
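The conditional flag could be wired up with Bicep's `if (...)` resource condition. The sketch below is a minimal, hedged example; the parameter names, SKU, capacity, and API version are assumptions for illustration and would need to match the repository's actual templates:

```bicep
// Hypothetical sketch: create the account and model only when not reusing existing ones.
param useExistingOpenAi bool = false
param existingOpenAiEndpoint string = ''
param openAiAccountName string = 'my-openai'
param openAiModelName string = 'gpt-4'

resource openAi 'Microsoft.CognitiveServices/accounts@2023-05-01' = if (!useExistingOpenAi) {
  name: openAiAccountName
  location: resourceGroup().location
  kind: 'OpenAI'
  sku: { name: 'S0' }
  properties: {}
}

resource modelDeployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = if (!useExistingOpenAi) {
  parent: openAi
  name: openAiModelName
  sku: {
    name: 'Standard' // a PTU deployment would use a provisioned SKU instead
    capacity: 10
  }
  properties: {
    model: { format: 'OpenAI', name: 'gpt-4', version: '0613' }
  }
}

// The proxy then picks whichever endpoint applies.
var openAiEndpoint = useExistingOpenAi ? existingOpenAiEndpoint : openAi.properties.endpoint
```

When `useExistingOpenAi` is true, nothing is created and the user-supplied endpoint and model name flow straight into the proxy configuration, which matches the PTU scenario described above.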