microsoft / typespec

https://typespec.io/
MIT License

Feature request: Multiple service metadata decorators in a single client #1925

Open tjprescott opened 1 year ago

tjprescott commented 1 year ago

OpenAI's scenario is that they need a single client that can make calls either to Azure OpenAI or "regular" OpenAI. These two services share largely the same API structure, models, etc. (by design), but they differ in their hosts and in the individual API routes, which exceeds the capabilities of the @server parameter. Simply packaging two clients into a single SDK has already been considered and rejected.

The team is requesting some kind of mechanism that allows routes to "switch" based on the host. This is currently accomplished through manual code.
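To make the "switch routes based on the host" idea concrete, here is a minimal hypothetical TypeScript sketch (not actual SDK code; the URL shapes follow the public Azure OpenAI and OpenAI endpoint formats, and all names are illustrative):

```typescript
// Hypothetical sketch of a single client whose routes fork on the backend.
// All identifiers here are illustrative, not actual SDK names.
type Backend = "azure" | "openai";

interface ClientOptions {
  endpoint: string;    // e.g. https://<resource>.openai.azure.com or https://api.openai.com
  backend: Backend;
  apiVersion?: string; // used by the Azure routes only
}

function completionsUrl(opts: ClientOptions, deploymentOrModel: string): string {
  if (opts.backend === "azure") {
    // Azure OpenAI: deployment in the path, api-version in the query string.
    return `${opts.endpoint}/openai/deployments/${deploymentOrModel}` +
      `/completions?api-version=${opts.apiVersion ?? "2023-05-15"}`;
  }
  // Non-Azure OpenAI: fixed /v1 route; the model is sent in the body instead.
  return `${opts.endpoint}/v1/completions`;
}
```

This is the kind of fork the SDKs currently implement by hand, and which @server alone cannot express, since the path template itself differs per host.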

timotheeguerin commented 1 year ago

Does @server with a parameterized host not fit the need? https://cadlplayground.z22.web.core.windows.net/?c=aW1wb3J0ICJAdHlwZXNwZWMvaHR0cCI7Cgp1c2luZyBUeXBlU3BlYy5IdHRwOwpAc2VydmljZcYJZXIoCiAgIntob3N0fSIsxAxPcGVuQVBJIMYfxRR7CiAgICDEJDogdXJsxBN9CikKbmFtZXNwYWNlIERlbW9Txlg7CgpvcCB0ZXN0KCk6IHN0cmluZzsK

tjprescott commented 1 year ago

No, it doesn't. It's not just the host that differs. While the APIs are the same, the Azure and non-Azure routes are different.

jpalvarezl commented 1 year ago

Hi everyone, I am here just to provide concrete examples, hoping this helps to better specify the context of this request.

We are currently trying to support both Azure and non-Azure OpenAI backends with our SDKs. We are working with several teams to generate SDKs for the following languages (so far):

For .NET and JavaScript, support for both the Azure and non-Azure OpenAI services under a single client SDK is already there. In the case of .NET, we had to override the GetUri method and alter the request URL to point to the right server. You can see the code here. For JavaScript, the split in behaviour happens here.

Java has just started development; for non-Azure OpenAI support I have drafted a PR that showcases some of the challenges of this "2 servers / 1 client" setup.

Mainly, Azure and non-Azure OpenAI:

One proposal surfaced in a meeting today by @tjprescott, which I personally liked, was the possibility of simply splitting the two services into two clients and adding just enough custom code to have a single client wrapper over the two. I think this would address all the issues while keeping the manual maintenance cost relatively low, although I could be missing something. I share it here so we don't lose visibility of all the alternatives, and so that we can take it into account when assessing the feasibility of this feature.
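A rough sketch of that "2 clients behind 1 wrapper" idea, in hypothetical TypeScript (the class and method names are made up for illustration; real implementations would make HTTP calls instead of returning strings):

```typescript
// Hypothetical sketch: two generated clients, one hand-written facade.
interface CompletionsClient {
  getCompletions(deploymentOrModel: string, prompt: string): string;
}

class AzureOpenAIClient implements CompletionsClient {
  getCompletions(deployment: string, prompt: string): string {
    return `azure:${deployment}:${prompt}`; // stand-in for a real HTTP call
  }
}

class OpenAIClient implements CompletionsClient {
  getCompletions(model: string, prompt: string): string {
    return `openai:${model}:${prompt}`; // stand-in for a real HTTP call
  }
}

// The wrapper is the only place that knows which backend is in use;
// everything after construction looks identical to the caller.
class UnifiedOpenAIClient implements CompletionsClient {
  private inner: CompletionsClient;
  constructor(useAzure: boolean) {
    this.inner = useAzure ? new AzureOpenAIClient() : new OpenAIClient();
  }
  getCompletions(deploymentOrModel: string, prompt: string): string {
    return this.inner.getCompletions(deploymentOrModel, prompt);
  }
}
```

The manual cost is confined to the thin facade; the two inner clients could each be generated from their own spec.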

timotheeguerin commented 1 year ago

If I understand correctly, the URLs are something of this sort:

Could we imagine something like this, where we provide the query param as part of the server path?

import "@typespec/http";

using TypeSpec.Http;
@service
@server(
  "https://azure.com/?api-version={apiVersion}",
  "Azure OpenAI server",
  {
    apiVersion: "2023-01-01" | "2023-02-01",
  }
)
@server(
  "https://openai.com/{apiVersion}",
  "OpenAI server",
  {
    apiVersion: "v1",
  }
)
namespace DemoService;

op test(): string;

Playground

jpalvarezl commented 1 year ago

I realise I could have shared the tsp definition we generate the code from. Here is the service definition.

There is also @useAuth to take into consideration. Technically, Azure OpenAI supports both AAD and token auth, but non-Azure OpenAI uses only token auth.

An example of how routes differ:

{
  "RequestUri": "https://api.openai.com/v1/completions",
...

These would return the same object type.

timotheeguerin commented 1 year ago

I see, so there is also the deployment concept, which is present in the Azure API but not in OpenAI. In the spec this is defined as an operation-level parameter; what do we expect the generated client to look like? Should it have that deployment parameter?

function getCompletions(deployment: string) {

}

// or deployment is provided at the client level?

function getCompletions() {

}

@useAuth being different does make this a little harder too.

jpalvarezl commented 1 year ago

This is one of the bigger pain points, or rather confusion points. While deploymentId is provided as part of the path for Azure OpenAI, in the non-Azure OpenAI case it is provided in the CompletionsOptions.modelName field of the request object (this is a POST operation). You can see the forking custom behaviour for this case in .NET over here

Note: the field is renamed to nonAzureModel
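The path-vs-body fork described here can be sketched as follows (hypothetical TypeScript, not the actual .NET code; all names are illustrative):

```typescript
// Hypothetical sketch of the forking behaviour: for Azure the deployment
// goes into the URL path; for non-Azure the same value is written into
// the request body as the model name.
interface CompletionsBody {
  prompt: string;
  model?: string; // non-Azure only (renamed to nonAzureModel in the .NET code)
}

function buildCompletionsRequest(
  isAzure: boolean,
  endpoint: string,
  deploymentOrModel: string,
  prompt: string
): { url: string; body: CompletionsBody } {
  if (isAzure) {
    return {
      url: `${endpoint}/openai/deployments/${deploymentOrModel}/completions`,
      body: { prompt }, // no model field on the wire for Azure
    };
  }
  return {
    url: `${endpoint}/v1/completions`,
    body: { prompt, model: deploymentOrModel }, // model moves into the body
  };
}
```

The client surface takes a single deploymentOrModel argument either way; only the wire placement changes.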

jpalvarezl commented 1 year ago

I brought up @useAuth just to illustrate some of the challenges, but as far as I understand it is something we don't need to address in this feature request. The more pressing issue is the split between route definitions for the two services.

timotheeguerin commented 1 year ago

I see, so to summarize the differences:

  1. API version is in the path vs. the query string
  2. API versions have different values
  3. deploymentId is provided in the path vs. in the request body
  4. authentication is different

I think the really problematic one here is going to be #3.

allenjzhang commented 1 year ago

There was another team that had a similar problem about a year ago. They had ARM and non-Azure deployment targets, but aside from server and auth, the rest was identical. They were Swagger-based at that time and didn't need a unified SDK, and I had worked out a conditional compilation to produce separate swaggers, but it was definitely not ideal. I would love to have a simplification like the multiple @server customizations suggested here.

Having said that, coming back to this request, I have concerns about #2. We are attempting to unify/simplify the TypeSpec between the two for some benefit right now, but because of #2 we are leaving the door wide open to future divergence. This could create a world of hurt when the specs start to diverge further. If we anticipate divergence now, we should set things up as Common/Svc1/Svc2 as separate specs. This would simplify the SDK not only from the emitter's perspective; I think users would also benefit from simpler SDK APIs.

trrwilson commented 1 year ago

If we anticipate divergence now, we should set it up as Common/Svc1/Svc2 as separate specs.

This is very accurate. The key thing is that -- as far as I know and will confirm with the Munich team -- avoiding divergence for the common capability set between Azure- and non-Azure OpenAI, starting at the REST API layer, is an express and critical goal for both customer experience and sustainability of our technology ingestion partnership.

Put otherwise, I'm fairly certain that we have an ongoing commitment for this statement to be true: "If it's something like Chat Completions that you can do using OpenAI whether it's in Azure or not, then you should be able to do it exactly* the same way once the client is configured, and it shouldn't matter if it's using Azure or not."

The asterisk hedge is to account for the small set of unavoidable changes that appear as a result of us being driven by the concept of deployments and unique deployment URLs when OpenAI's service isn't. That's where the URL and auth differences arise, as well as where the "model or deployment name" merge comes from, but that set of drift isn't expected to grow.

Using a more concrete example, this is the basic "service switch" as shown in the .NET library's readme:

OpenAIClient client = useAzureOpenAI
    ? new OpenAIClient(
        new Uri("https://your-azure-openai-resource.com/"),
        new AzureKeyCredential("your-azure-openai-resource-api-key"))
    : new OpenAIClient("your-api-key-from-platform.openai.com");

Response<Completions> response = await client.GetCompletionsAsync(
    "text-davinci-003", // assumes a matching model deployment or model name
    "Hello, world!");

foreach (Choice choice in response.Value.Choices)
{
    Console.WriteLine(choice.Text);
}

Because getting completions is something a customer should be able to do with either Azure or non-Azure OpenAI, the principle we want to retain is that the only line that needs to change (or make a programmatic selection, as in this case) is the one that creates/configures the client.

timotheeguerin commented 1 year ago

What is very nice about this example is that it is exactly what we have been trying to do with TypeSpec: the operation signature is meant to match what you want in a client, not what goes over the wire. Whether the deploymentId is in the path or sent inside the body doesn't matter to the user of the SDK.

Now I think this might be a good candidate for projections, where you would project the OpenAI definition to either the Azure shape or the regular one.

// We could imagine something like this
projection op#toAzure {
  to {
    if (self::parameters.get("deploymentId")) {
      @path(self::parameters.get("deploymentId"));
    }
  }
}

One of the plans for projections was to serialize the projection steps so that an emitter could replicate them. Maybe that is the right solution here, but it would also be quite a lot of effort to implement.