This repo is an exploration into creating a simulated API implementation for Azure OpenAI (AOAI).
WARNING: This is a work in progress!
The Azure OpenAI API Simulator is a tool that allows you to easily deploy endpoints that simulate the OpenAI API. A common use case for the simulator is to test the behaviour of your application under load, without making calls to the live OpenAI API endpoints.
Let's illustrate this with an example...
Let's assume that you have built a chatbot that uses the OpenAI API to generate responses to user queries. Before your chatbot becomes popular, you want to ensure that it can handle a large number of users. One of the factors that will impact whether your chatbot can gracefully handle such load is the way that your chatbot handles calls to OpenAI. However, when load testing your chatbot there are a number of reasons why you might not want to call the OpenAI API directly, such as the cost of the calls, the rate limits applied to your endpoints, and the latency of the responses.
In fact, rate limits and latency might be things that you'd like to control rather than avoid: you may want to inject latency, or inject rate limit issues, so that you can test how your chatbot deals with them.
This is where the Azure OpenAI API Simulator plays its part!
By using the Azure OpenAI API Simulator instead of the live OpenAI API, you can reduce the cost of running load tests and ensure that your application behaves as expected under different conditions.
The Azure OpenAI API Simulator presents the same interface as the live OpenAI API, allowing you to easily switch between the two, and then gives you full control over the responses that are returned.
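Because the simulator exposes the same API surface as the live service, switching between the two is typically just a configuration change in your client. Below is a minimal sketch using the official `openai` Python package; the endpoint URL, API key, API version, and deployment name are placeholder values for illustration, not defaults shipped with the simulator.

```python
from openai import AzureOpenAI

# Point the client at the simulator instead of your Azure OpenAI resource.
# All values below are placeholders -- use whatever your simulator deployment expects.
client = AzureOpenAI(
    azure_endpoint="http://localhost:8000",
    api_key="simulator-api-key",
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="gpt-35-turbo",  # deployment name; the simulator accepts the same request shape
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

Switching back to the live service is then just a matter of restoring the real endpoint and key.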
The Azure OpenAI API Simulator has two approaches to simulating API responses: generating responses on the fly (generator mode), and recording and replaying responses (record/replay mode).
When run in generator mode, the Azure OpenAI API Simulator creates responses to requests on the fly. This mode is useful for load testing scenarios where it would be costly or impractical to record the full set of responses, or where the content of the response is not critical to the load testing.
With record/replay, the Azure OpenAI API Simulator is set up to act as a proxy between your application and Azure OpenAI. The simulator then records the requests that are sent to it along with the corresponding responses from the Azure OpenAI API.
Once a set of recordings has been made, the Azure OpenAI API Simulator can be run in replay mode, where it serves these saved responses without forwarding anything to the OpenAI API.
Recordings are stored in YAML files which can be edited if you want to customise the responses.
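To make the record/replay flow concrete, here is a purely illustrative sketch of the mechanism in Python. This is not the simulator's implementation, and the recording structure shown is hypothetical rather than the simulator's actual YAML format; it only shows the shape of the idea described above: recording forwards the request and persists the exchange, replaying answers from the persisted data without calling the upstream API.

```python
import httpx
import yaml

RECORDING_FILE = "recording.yaml"  # hypothetical file name and format

def record(method: str, path: str, body: dict, upstream_base: str) -> dict:
    """Forward the request to the upstream API and persist the exchange."""
    response = httpx.request(method, upstream_base + path, json=body)
    entry = {
        "method": method,
        "path": path,
        "request": body,
        "status": response.status_code,
        "response": response.json(),
    }
    try:
        with open(RECORDING_FILE) as f:
            entries = yaml.safe_load(f) or []
    except FileNotFoundError:
        entries = []
    entries.append(entry)
    with open(RECORDING_FILE, "w") as f:
        yaml.safe_dump(entries, f)
    return entry["response"]

def replay(method: str, path: str, body: dict) -> dict:
    """Answer a matching request from the recording, without any upstream call."""
    with open(RECORDING_FILE) as f:
        entries = yaml.safe_load(f) or []
    for entry in entries:
        if entry["method"] == method and entry["path"] == path and entry["request"] == body:
            return entry["response"]
    raise LookupError("No recorded response for this request")
```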
The Azure OpenAI API Simulator has been used in a range of scenarios, most commonly load testing applications without calling the live OpenAI API.
The Azure OpenAI API Simulator is not a replacement for testing against the real Azure OpenAI API, but it can be a useful tool in your testing toolbox.
The document Running and Deploying the Azure OpenAI API Simulator includes instructions on running the Azure OpenAI API Simulator locally, packaging and deploying it in a Docker container, and deploying it to Azure Container Apps.
The behaviour of the Azure OpenAI API Simulator is controlled via a range of Azure OpenAI API Simulator Configuration Options.
There are also a number of Azure OpenAI API Simulator Extension points that allow you to customise the behaviour of the Azure OpenAI API Simulator. Extensions can be used to modify the request/response, add latency, or even generate responses.
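As a purely illustrative example of the kind of behaviour an extension might add, the sketch below injects a small amount of random latency into a response. The hook name, signature, and registration mechanism are invented for illustration and are not the simulator's actual extension API; refer to the extension points documentation for the real contract.

```python
import asyncio
import random

# Hypothetical hook -- the simulator's real extension API may look quite different.
async def add_latency(request, response):
    """Delay the simulated response by 50-250 ms before returning it."""
    await asyncio.sleep(random.uniform(0.05, 0.25))
    return response
```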
Finally, if you're looking to contribute to the Azure OpenAI API Simulator you should refer to the Azure OpenAI API Simulator Development Guide.
For a list of tagged versions and changes, see the CHANGELOG.md file.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.