tjprescott opened 1 year ago
Does `@server` with a parameterized host not fit the need?
https://cadlplayground.z22.web.core.windows.net/?c=aW1wb3J0ICJAdHlwZXNwZWMvaHR0cCI7Cgp1c2luZyBUeXBlU3BlYy5IdHRwOwpAc2VydmljZcYJZXIoCiAgIntob3N0fSIsxAxPcGVuQVBJIMYfxRR7CiAgICDEJDogdXJsxBN9CikKbmFtZXNwYWNlIERlbW9Txlg7CgpvcCB0ZXN0KCk6IHN0cmluZzsK
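For reference, here is a minimal sketch of what the linked playground example shows, assuming the standard parameterized-host pattern (the decoded example may differ slightly in names):

```typespec
import "@typespec/http";

using TypeSpec.Http;

// The entire host is a client-provided parameter of type `url`.
@service
@server("{host}", "OpenAPI server", {
  host: url,
})
namespace DemoService;

op test(): string;
```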
No, it doesn't. It's not just the host that differs. While the APIs are the same, the Azure and non-Azure routes are different.
Hi everyone, I am here just to provide concrete examples, hoping this helps better specify the context of this request.

We are currently trying to support both Azure and non-Azure OpenAI backends with our SDKs. We are currently working with several teams to generate SDKs for the following languages (so far):

- .NET
- JavaScript
- Java
For .NET and JavaScript, support for both Azure and non-Azure OpenAI services under a single client SDK is already there. In the case of .NET, we had to override the `GetUri` method and alter the request URL to point to the right server. You can see the code here. For JavaScript, the split in behaviour happens here.
Java has just started development; for non-Azure OpenAI support I have drafted a PR that showcases some of the challenges of this "2 servers/1 client" setup.
Mainly, Azure and non-Azure OpenAI:

- have different hosts and routes
- support different auth mechanisms
- handle API versioning differently (non-Azure OpenAI only has `v1`, that I know of)

One proposal surfaced in a meeting today by @tjprescott, which I personally liked, was the possibility of simply splitting the 2 services into 2 clients and adding just enough custom code to have a single client wrapper over the 2. I think this would address all the issues while seemingly keeping the manual maintenance cost relatively low, although I could be missing something. I share this here too so we don't lose visibility of all the alternatives, and so that we can also take this into account when assessing the feasibility of this feature.
If I understand correctly, the URLs are something of the sort shown in the sketch below. Could we imagine something like this, where we provide the query param as part of the server path?
import "@typespec/http";
using TypeSpec.Http;
@service
@server(
"azure.com/?api-version${api-version}",
"OpenAPI server",
{
apiVersion: "2023-01-01" | "2023-02-01",
}
)
@server(
"openapi.com",
"OpenAPI server",
{
apiVersion: "v1",
}
)
namespace DemoService;
op test(): string;
I realise I could have shared the `tsp` definition that we generate the code from. Here is the service definition.
There is also `@useAuth` to take into consideration. Technically, Azure OpenAI (AOAI) supports AAD and token auth, but non-Azure OpenAI uses only token auth.
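A rough sketch of that auth split using the built-in `TypeSpec.Http` auth models (the AAD/OAuth2 flow details are elided, so treat this as illustrative only):

```typespec
import "@typespec/http";

using TypeSpec.Http;

// Azure OpenAI: AAD (OAuth2, flow elided here) or an "api-key" header.
@useAuth(ApiKeyAuth<ApiKeyLocation.header, "api-key">)
namespace AzureOpenAIService {
}

// Non-Azure OpenAI: bearer token only.
@useAuth(BearerAuth)
namespace OpenAIService {
}
```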
An example of how routes differ:

For Azure OpenAI:

```json
{
  "RequestUri": "https://openai.azure.com/openai/deployments/text-davinci-002/completions?api-version=2023-03-15-preview"
  ...
```

For non-Azure OpenAI:

```json
{
  "RequestUri": "https://api.openai.com/v1/completions",
  ...
```
These would return the same object type.
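In TypeSpec terms, the divergence above amounts to two differently shaped operations. A sketch with illustrative names (not the actual spec):

```typespec
import "@typespec/http";

using TypeSpec.Http;

model CompletionsOptions {
  modelName: string;
  prompt: string;
}

// Azure OpenAI: deployment is a path segment, api-version a query parameter.
@route("/openai/deployments/{deploymentId}/completions")
@post
op getCompletionsAzure(
  @path deploymentId: string,
  @query("api-version") apiVersion: string,
  @body options: CompletionsOptions,
): string;

// Non-Azure OpenAI: fixed route, model name travels in the request body.
@route("/v1/completions")
@post
op getCompletionsOpenAI(@body options: CompletionsOptions): string;
```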
I see, so there is also the `deployments` path segment that is present in the Azure API but not in OpenAI. In the spec this is defined as an operation-level parameter; what do we expect the generated client to look like? Should it have that deployment parameter?
```ts
function getCompletions(deployment: string) {}

// or deployment is provided at the client level?
function getCompletions() {}
```
`@useAuth` being different does make this a little harder too.
This is one of the bigger pain points, or rather confusion points. While `deploymentId` is provided as part of the path for Azure OpenAI, in the non-Azure OpenAI case it is provided in the `CompletionsOptions.modelName` field of the request object (this is a `POST` operation). You can see the forking custom behaviour for this case in .NET over here.

Note: the field is renamed to `nonAzureModel`.
I brought up `@useAuth` just to illustrate some of the challenges, but it is something that we certainly don't need to address in this feature request, as far as I understand. The more pressing issue is the split between the route definitions of the two services.
I see, so to summarize the differences:

1. Different hosts (`openai.azure.com` vs `api.openai.com`)
2. Different auth (AAD or key for Azure, token only for non-Azure)
3. Azure OpenAI takes the deployment in the URL path, while non-Azure OpenAI takes the model name in the request body

I think the real problematic one is going to be 3 here.
There was another team that had a similar problem about a year ago. They had ARM and non-Azure deployment targets, but aside from server and auth, the rest was identical. They were swagger-based and didn't need a unified SDK at the time, and I had worked out a conditional compilation scheme to produce separate swaggers, but it was definitely not ideal. I would love to have a simplification as suggested via multiple `@server` customizations.
Having said that, coming back to this request, I have concerns about #2. We are attempting to unify/simplify the TypeSpec between the two for some benefit right now, but because of #2 we are leaving the door to future divergence wide open. This could create a world of hurt when the spec starts to diverge further. If we anticipate divergence now, we should set it up as Common/Svc1/Svc2 as separate specs. This would simplify the SDK not only from the emitter perspective; I think users would also benefit from simpler SDK APIs.
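A minimal sketch of what that Common/Svc1/Svc2 layout could look like (file names and namespaces are mine, purely illustrative):

```typespec
// common.tsp: the shared surface; deliberately no @server or @useAuth here.
import "@typespec/http";

using TypeSpec.Http;

namespace OpenAICommon {
  op getCompletions(prompt: string): string;
}

// azure.tsp (shown inline for brevity): Azure-specific binding of the
// shared surface, layering host and auth on top.
@service
@server("https://{resource}.openai.azure.com", "Azure OpenAI", {
  resource: string,
})
namespace AzureOpenAIService {
  op getCompletions is OpenAICommon.getCompletions;
}
```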
> If we anticipate divergence now, we should set it up as Common/Svc1/Svc2 as separate specs.
This is very accurate. The key thing is that, as far as I know and will confirm with the Munich team, avoiding divergence in the common capability set between Azure and non-Azure OpenAI, starting at the REST API layer, is an express and critical goal for both customer experience and the sustainability of our technology ingestion partnership.
Put otherwise, I'm fairly certain that we have an ongoing commitment for this statement to be true: "If it's something like Chat Completions that you can do using OpenAI whether it's in Azure or not, then you should be able to do it exactly* the same way once the client is configured, and it shouldn't matter whether it's using Azure or not."
The asterisk hedge is to account for the small set of unavoidable changes that appears as a result of us being driven by the concept of deployments and unique deployment URLs when OpenAI's service isn't. That's where the URL and auth differences arise, as well as where the "model or deployment name" merge comes from, but that set of drift isn't expected to grow.
Using a more concrete example, this is the basic "service switch" as shown in the .NET library's readme:
```csharp
OpenAIClient client = useAzureOpenAI
    ? new OpenAIClient(
        new Uri("https://your-azure-openai-resource.com/"),
        new AzureKeyCredential("your-azure-openai-resource-api-key"))
    : new OpenAIClient("your-api-key-from-platform.openai.com");

Response<Completions> response = await client.GetCompletionsAsync(
    "text-davinci-003", // assumes a matching model deployment or model name
    "Hello, world!");

foreach (Choice choice in response.Value.Choices)
{
    Console.WriteLine(choice.Text);
}
```
Because getting completions is something a customer should be able to do with either Azure or non-Azure OpenAI, the principle we want to retain is that the only line that needs to change (or make a programmatic selection, in this case) is the one that creates/configures the client.
I think what is very nice about this example is that it is the exact thing we have been trying to do with TypeSpec. The operation signature is meant to match what you want in a client, not what it should be over the wire. That the deploymentId is in the path or sent inside the body doesn't matter to the user of the SDK.
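A sketch of that client-view idea in TypeSpec (names are illustrative): the logical signature stays fixed, and where the parameter binds on the wire is left to the emitter or a projection.

```typespec
model Completions {
  choices: string[];
}

// One logical operation: whether deploymentOrModelName ends up in the URL
// path (Azure) or inside the request body (non-Azure) is a wire-level
// detail the SDK user never sees.
op getCompletions(deploymentOrModelName: string, prompt: string): Completions;
```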
Now I think this might be a good candidate for projections, where you would project the OpenAI definition into the Azure shape or the regular one.
```typespec
// We could imagine something like this
projection op#toAzure {
  to {
    if (self::parameters.get("deploymentId")) {
      @path(self::parameters.get("deploymentId"));
    };
  }
}
```
One of the plans for projections was to serialize the projection steps so that an emitter could replicate them. Maybe this is the right solution here, but it would also be quite a lot of effort to implement.
OpenAI's scenario is that they need a single client that can make calls either to Azure OpenAI or "regular" OpenAI. These two services share largely the same API structure, models, etc. (by design), but they differ in their hosts and in the actual individual API routes, which exceeds what the `@server` parameter can express. Simply packaging two clients into a single SDK has already been considered and rejected. The team is requesting some kind of mechanism to allow routes to "switch" based on the host. This is currently accomplished through manual code.
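Purely as an illustration of the kind of mechanism being requested, one could imagine per-server route overrides along these lines (note: `@routeFor` is hypothetical and does not exist in TypeSpec today):

```typespec
import "@typespec/http";

using TypeSpec.Http;

@server("https://{resource}.openai.azure.com", "Azure OpenAI", {
  resource: string,
})
@server("https://api.openai.com", "OpenAI")
namespace OpenAIService;

// Hypothetical decorator binding a different route template to each server,
// so a single logical operation can "switch" by host.
@routeFor("Azure OpenAI", "/openai/deployments/{deploymentOrModelName}/completions")
@routeFor("OpenAI", "/v1/completions")
op getCompletions(deploymentOrModelName: string, prompt: string): string;
```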