Azure / azure-functions-durable-js

JavaScript library for using the Durable Functions bindings
https://www.npmjs.com/package/durable-functions
MIT License
128 stars 47 forks

Improve the experience of getting the Durable client in the new prog model #414

Closed hossam-nasr closed 11 months ago

hossam-nasr commented 1 year ago

This issue is based off discussions here, here, and here.

In the current alpha release of the SDK (3.0.0-alpha.1) to support durable functions in the new programming model, this is the best experience for a user to register an HTTP-triggered durable client starter function:

const clientInput = df.input.durableClient();

const httpStart: HttpHandler = async (request: HttpRequest, context: InvocationContext) => {
    const client = df.getClient(context, clientInput);
    const instanceId = await client.startNew(request.params.orchestratorName, undefined, request.text());
    context.log(`Started orchestration with ID = '${instanceId}'.`);
    return client.createCheckStatusResponse(request, instanceId);
};

app.http('durableOrchestrationStart1', {
    route: 'orchestrators/{orchestratorName}',
    extraInputs: [clientInput],
    handler: httpStart,
});

As seen above, this requires the user to explicitly declare a durable client input binding, specify it in the extraInputs parameter when registering the function, and pass it to the df.getClient() method. This is clearly not an ideal user experience, and arguably a downgrade from the old model experience.

One suggestion/alternative would be to have the SDK provide an API that registers a "durable client handler," which takes three arguments instead of the usual two (the trigger input, the context, and the durable client), while the SDK handles the input binding behind the scenes and provides the durable client directly to the handler. Such user code could look like this:

const clientHandler: DurableClientHandler = async (context: InvocationContext, request: HttpRequest, client) => {
    const instanceId = await client.startNew(request.params.orchestratorName, undefined, request.text());
    context.log(`Started orchestration with ID = '${instanceId}'.`);
    return client.createCheckStatusResponse(request, instanceId);
};

df.app.client('durableOrchestrationStart1', {
    trigger: trigger.http({ route: 'orchestrators/{orchestratorName}' }),
    return: output.http({}),
    handler: clientHandler,
});

However, per the discussion here, this "appears to come at a tradeoff with the trigger (you lose all the intellisense and defaults of doing app.http)".

Another alternative would be to provide some equivalent to app.http(), specifically for http-triggered durable client functions. As an example:

const clientHandler: DurableClientHandler = async (context: InvocationContext, request: HttpRequest, client) => {
    const instanceId = await client.startNew(request.query.get('orchestratorName'), undefined, request.text());
    context.log(`Started orchestration with ID = '${instanceId}'.`);
    return client.createCheckStatusResponse(request, instanceId);
};

df.app.httpClient('durableOrchestrationStart1', clientHandler);

However, since client functions can be triggered by any trigger type, this raises the question of whether we should provide similar APIs for every possible trigger, which would place a significant burden on the SDK.

This issue is to discuss alternatives to build upon or improve the current experience.

ejizba commented 1 year ago

I think the main alternative that was missed was the option that mimics the old programming model. So something like this:

const httpStart: HttpHandler = async (request: HttpRequest, context: InvocationContext) => {
    const client = df.getClient(context);
    const instanceId = await client.startNew(request.params.orchestratorName, undefined, request.text());
    context.log(`Started orchestration with ID = '${instanceId}'.`);
    return client.createCheckStatusResponse(request, instanceId);
};

app.http('durableOrchestrationStart1', {
    route: 'orchestrators/{orchestratorName}',
    extraInputs: [df.input.durableClient()],
    handler: httpStart,
});

The only place the user has to mention the client binding is when setting up the function (aka "function.json" in the old model and "extraInputs:" in the new model). Then when getting the client, the user doesn't have to pass in the binding config because under the covers we would search through the bindings for them.

Is it an elegant, cool new way of doing things? No, but it might be the easiest way to avoid a "downgrade".
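The "search through the bindings" idea can be sketched concretely. The following is a minimal, self-contained approximation with simplified stand-in types (`MinimalContext`, `DurableClientStub`, and the payload shape are all illustrative, not the real SDK surface):

```typescript
// Illustrative sketch only: simplified stand-ins for the real SDK types.
interface FunctionInput {
    type: string;
    name: string;
}

interface MinimalContext {
    // mirrors how registration options could be exposed back on the context
    options: { extraInputs: FunctionInput[] };
    extraInputs: { get(input: FunctionInput): unknown };
}

class DurableClientStub {
    constructor(public readonly taskHubName: string) {}
}

// df.getClient(context) could scan the registered inputs for the one
// durable client binding, then convert its raw payload into a client.
function getClient(context: MinimalContext): DurableClientStub {
    const clientInput = context.options.extraInputs.find(
        (input) => input.type === "durableClient"
    );
    if (!clientInput) {
        throw new Error("No durable client input binding was registered in extraInputs");
    }
    const payload = context.extraInputs.get(clientInput) as { taskHubName: string };
    return new DurableClientStub(payload.taskHubName);
}

// Minimal usage against a faked context:
const durableInput: FunctionInput = { type: "durableClient", name: "client" };
const fakeContext: MinimalContext = {
    options: { extraInputs: [durableInput] },
    extraInputs: { get: () => ({ taskHubName: "TestHub" }) },
};
console.log(getClient(fakeContext).taskHubName); // TestHub
```

Because the lookup keys off the binding's `type`, the user never has to hold a reference to the input object, which is exactly what makes this feel like the old model.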

jviau commented 1 year ago

I'd like to propose two other options. Keep in mind that I am not very familiar with JS, so I cannot offer implementation details, and I may not realize if something goes against idiomatic JS.

  1. Keep it consistent with how you get input for other extensions:
const clientInput = df.input.durableClient();

const httpStart: HttpHandler = async (request: HttpRequest, context: InvocationContext) => {
    const client = context.getInput(clientInput); // Can we add or use an existing extensibility point in the Functions JS SDK so that the DF SDK can hook in here and handle converting to the client type.
    const instanceId = await client.startNew(request.params.orchestratorName, undefined, request.text());
    context.log(`Started orchestration with ID = '${instanceId}'.`);
    return client.createCheckStatusResponse(request, instanceId);
};

app.http('durableOrchestrationStart1', {
    route: 'orchestrators/{orchestratorName}',
    extraInputs: [clientInput],
    handler: httpStart,
});
  2. More of an overhaul to the JS SDK to be like C#'s IInputConverter

The goal of this one is to keep the same app.http, but let users add parameters to their function declaration, which will match extraInputs in order. Again, extensions will need a hook to convert the input data into the desired final type.

const clientInput = df.input.durableClient();

const httpStart: HttpHandler = async (request: HttpRequest, context: InvocationContext, client: DurableClient) => { // not sure what the exact DurableClient type is.
    const instanceId = await client.startNew(request.params.orchestratorName, undefined, request.text());
    context.log(`Started orchestration with ID = '${instanceId}'.`);
    return client.createCheckStatusResponse(request, instanceId);
};

app.http('durableOrchestrationStart1', {
    route: 'orchestrators/{orchestratorName}',
    extraInputs: [clientInput],
    handler: httpStart,
});

I would want to avoid using a df.app.http here because this should be combinable with other extensions, like I could add a storage client as well:

const clientInput = df.input.durableClient();
const storageInput = // get storage input

const httpStart: HttpHandler = async (request: HttpRequest, context: InvocationContext, client: DurableClient, storage: StorageClient) => { // not sure what the exact DurableClient or StorageClient type is.
    const instanceId = await client.startNew(request.params.orchestratorName, undefined, request.text());
    context.log(`Started orchestration with ID = '${instanceId}'.`);
    return client.createCheckStatusResponse(request, instanceId);
};

app.http('durableOrchestrationStart1', {
    route: 'orchestrators/{orchestratorName}',
    extraInputs: [clientInput, storageInput],
    handler: httpStart,
});

I am aware the above would need a bunch of work around getting the types right, to allow the *Handler types to accept the extra inputs in the signature (like HttpHandler probably only has the two params defined on it; how do we make it accept arbitrary trailing params?). Maybe something like this is just not possible in TS/JS.

ejizba commented 1 year ago

In response to @jviau, specifically option 2. Here's the line that I interpret as the main difference:

const httpStart: HttpHandler = async (request: HttpRequest, context: InvocationContext, client: DurableClient) => {

You already alluded to it, but the main difficulty is getting the types right. We know when a user calls app.http that the two arguments to the user's handler will have to be request and context. We can "enforce" that for TypeScript with build errors, or at least provide Intellisense in the case of JavaScript. For the extraInputs, the type is not tied to app.http and so we have no way to know what these extra arguments would be on the handler. We would not be able to enforce the type or the number of extra inputs. This is basically what it would end up looking like:

const httpStart: HttpHandler = async (request: HttpRequest, context: InvocationContext, ...args: unknown[]) => {

I really don't like the args stuff because it adds confusion for the majority of users who don't use extra inputs. Durable is a bit unique in the sense that the client is currently an "extra" input but is really a "required" input.
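The confusion has a concrete typing cost. In this self-contained illustration (all types are simplified stand-ins, not the actual SDK types), the trailing extra inputs collapse to unknown, forcing an unchecked cast in every handler that uses them:

```typescript
// Simplified stand-ins for illustration only.
type HttpRequestStub = { url: string };
type ContextStub = { log: (msg: string) => void };

// The best general signature app.http could offer for arbitrary extra inputs:
type LooseHandler = (
    request: HttpRequestStub,
    context: ContextStub,
    ...extraInputs: unknown[]
) => unknown;

const handler: LooseHandler = (request, context, client) => {
    // `client` is `unknown`: the compiler cannot verify this cast,
    // so a mis-registered binding only surfaces as a runtime failure.
    const typedClient = client as { startNew: (name: string) => string };
    return typedClient.startNew("helloOrchestrator");
};

// Works only if the runtime happens to pass a compatible object:
console.log(
    handler({ url: "/" }, { log: console.log }, {
        startNew: (name: string) => `id-for-${name}`,
    })
); // id-for-helloOrchestrator
```

Nothing ties the third parameter's type to the extraInputs array at registration time, which is the enforcement gap described above.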

On the flip side, we are able to provide some intellisense by making users get/set extra stuff on the context object. Details here. You might also notice that way more extra output types are supported than extra input types; that's because functions itself doesn't support many extra inputs (just one more sign that durable is a bit unique here).

I believe the main difference between .NET and Node.js is that .NET has decorators that are used for extra inputs and can impact the handler's signature and enforce what the extra input types are. Node.js historically has not had decorators; there are a few implementations in various stages of preview, but they are not production-ready, let alone idiomatic. We have an issue here to explore in the future.

hossam-nasr commented 1 year ago

I want to share another proposal here by @cgillum:

Consistency is important for helping users learn this. If it doesn't make sense for durable to use context.extraInputs.get(...), could other bindings copy the durable model, for example, sb.getMessage(...) for service bus and cosmos.getDocument(...) for cosmos DB?

I would also add that this might go well with Chris's other proposal to follow Durable's example when registering functions too:

Ideally there would be a consistent model for all trigger types so that Durable doesn't come across as being a special case. For example, sb.app.queue and sb.app.topic would make me feel better because then all extensions register trigger using the same basic pattern.

I think this would essentially flip the hierarchy of namespaces from type.service to service.type, so for example, this is how you might register a service bus topic trigger function with an http primary output, storage queue secondary output, and a blob input binding:

import { sb, storage, http, InvocationContext, HttpResponse } from "@azure/functions"

const blobInput = storage.input.blob(/* blob options */);
const queueOutput = storage.output.queue(/* queue options */);
const httpOutput = http.output(/* HTTP options */);

sb.app.topic({
    // SB options
    return: httpOutput,
    extraInputs: [blobInput],
    extraOutputs: [queueOutput],
    handler: async function (message: unknown, context: InvocationContext): Promise<HttpResponse> {
        // read blob input
        const blobValue = storage.blob.getBlob(context, blobInput);
        // set queue output
        storage.queue.setMessage(context, queueOutput, `Hello, ${message}`);
        // return value mapped to http output
        return new HttpResponse({ status: 200 });
    }
});

While the durable example would remain the same, at least the rest of the triggers/bindings would follow a similar pattern, so Durable is less of the odd one out. I don't mind making this change at all, and if anything, I feel like it might even be more intuitive to understand on its own, regardless of the durable question.

hossam-nasr commented 1 year ago

One more suggestion I want to throw on here: what if we add a method in the durable SDK to "transform" any method from @azure/functions that registers a function and accepts a handler of type FunctionHandler into one that accepts a handler of type DurableClientHandler (i.e., a function with three arguments), sort of similar to the idea of promisify()? For example, here is what it may look like to register client functions:

import * as df from "durable-functions"
import { app } from "@azure/functions" 

// register an HTTP-triggered client
const registerHttpClient = df.app.clientify(app.http) // haven't come up with a better name for this utility
const httpClientHandler: DurableClientHandler = async (request: HttpRequest, client: DurableClient, context: InvocationContext) => {
    // client function content, using client directly
};
registerHttpClient('httpClient', {
    handler: httpClientHandler
});

// register a blob-triggered client
const registerBlobClient = df.app.clientify(app.storageBlob);
const blobClientHandler: DurableClientHandler = async (msg: unknown, client: DurableClient, context: InvocationContext) => {
    // client function content
};
registerBlobClient('blobClient', {
    path: "blobs/file.txt",
    connection: "someConnString",
    handler: blobClientHandler
});

I'm not 100% sure if this can work while having the resulting function maintain the same type richness and intellisense as the original. I tried to hack together something, and I could get it to mostly work (it wasn't able to get the type of the handler property 100% correct), but I figured it's not worth investing extra time debugging if we won't pursue this further.
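For reference, one possible shape of such a wrapper, sketched with simplified stand-in types (clientify, the handler types, and the fake app.http here are all illustrative, not the real APIs):

```typescript
// Simplified stand-ins for illustration only.
type ContextStub = { rawClientMetadata: { taskHubName: string } };

class DurableClientStub {
    constructor(public readonly taskHubName: string) {}
}

type Handler = (trigger: unknown, context: ContextStub) => unknown;
type ClientHandler = (trigger: unknown, context: ContextStub, client: DurableClientStub) => unknown;

type RegisterFn = (name: string, options: { handler: Handler }) => void;
type ClientRegisterFn = (name: string, options: { handler: ClientHandler }) => void;

// clientify wraps a registration function: it swaps in a two-argument
// wrapper handler that resolves the client first, then forwards to the
// user's three-argument handler.
function clientify(register: RegisterFn): ClientRegisterFn {
    return (name, options) =>
        register(name, {
            ...options,
            handler: (trigger, context) => {
                const client = new DurableClientStub(context.rawClientMetadata.taskHubName);
                return options.handler(trigger, context, client);
            },
        });
}

// A fake app.http that just records handlers, so we can observe the wiring:
const registered: Record<string, Handler> = {};
const fakeAppHttp: RegisterFn = (name, options) => {
    registered[name] = options.handler;
};

const registerHttpClient = clientify(fakeAppHttp);
registerHttpClient("httpClient", {
    handler: (_trigger, _context, client) => `hub=${client.taskHubName}`,
});

const result = registered["httpClient"]("req", { rawClientMetadata: { taskHubName: "TestHub" } });
console.log(result); // hub=TestHub
```

The hard part hinted at above is preserving the original options type (route, connection, etc.) generically rather than hard-coding `{ handler }` as this sketch does.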

jviau commented 1 year ago

To clarify @cgillum's point offline regarding df.app.* and sb.app.*: I think it is more about making things consistent, and not so much favoring one form over the other. So we could go the other direction as well and somehow extend app: app.http.*, app.storage.*, app.durable.*, app.cosmosdb.*, etc. Although I'm not sure how that would be done. One way would be to add a singleton object off of app which can then be extended: app.register.http.*, app.register.storage.*, app.register.durable.*.

On a side note, I did find that app.get was a bit vague, and I suggest clarifying it as http somehow. What we are discussing should help fix that: http.app.get or app.http.get would both clarify it.

davidmrdavid commented 1 year ago

Interesting suggestion, @hossam-nasr. That clientify is almost working like a Python decorator, so it's clever.

Taking a step back, it doesn't seem to me like we fully explored @jviau's first suggestion.

This one: "Keep it consistent with how you get input for other extensions:"

const clientInput = df.input.durableClient();

const httpStart: HttpHandler = async (request: HttpRequest, context: InvocationContext) => {
    const client = context.getInput(clientInput); // Can we add or use an existing extensibility point in the Functions JS SDK so that the DF SDK can hook in here and handle converting to the client type.
    const instanceId = await client.startNew(request.params.orchestratorName, undefined, request.text());
    context.log(`Started orchestration with ID = '${instanceId}'.`);
    return client.createCheckStatusResponse(request, instanceId);
};

app.http('durableOrchestrationStart1', {
    route: 'orchestrators/{orchestratorName}',
    extraInputs: [clientInput],
    handler: httpStart,
});

@hossam-nasr / @ejizba: What prevents us from going this route? Is it simply that the resulting client object in this call would not have the right type, i.e., that it needs to be deserialized still? I think that's something we can work around. Before going any deeper with suggestions, I want to understand if that's the main drawback.

jviau commented 1 year ago

Implementation-wise for my suggestion: could the returned clientInput object be the input converter itself? @hossam-nasr's concern was that additional context needed to be integrated into the clientInput, which is why they needed a different way to get it. What if there was a contract like so (forgive my bad JS... again):

interface InputConverter<T> {
    getInput(context: InvocationContext): T
}

interface InvocationContext { // probably not an interface
    getInput<T>(converter: InputConverter<T>): T
}

const clientInput = df.input.durableClient(); // this returns an InputConverter<DurableClient> now.

hossam-nasr commented 1 year ago

@jviau Responding to your earlier comment:

I think it is more to make it consistent and not so much as favoring one form or the other.

I get the point about consistency, but I personally favor this way of doing it (sb.app.queue(/* */)) over the other way, because the former is easier to implement on our side while achieving the same goal of consistency. It is much harder to provide an extensibility model that allows the durable SDK to modify the app namespace or objects on it.

@davidmrdavid To your question:

What prevents us from going this route? Is it simply that the resulting client object in this call would not have the right type, i.e., that it needs to be deserialized still?

More or less, yes. It is because context.extraInputs.get(clientInput) would not return the client itself, but rather the metadata that the SDK uses to create the client. To make it work, we would either have to move the client-creation logic into the @azure/functions library (not ideal) or provide some extensibility model as described above.

@jviau

Could the returned clientInput object be the input converter itself?

It took me a while to get what you're saying, but I think I get it now, and yes, it would be possible.
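A runnable sketch of that converter idea, with illustrative stand-in names (none of this is the real SDK surface): the object returned by df.input.durableClient() would both describe the binding and know how to turn the raw metadata into a client.

```typescript
// Simplified stand-ins for illustration only.
interface ContextStub {
    // raw payloads keyed by binding name, roughly as the host might deliver them
    rawInputs: Record<string, unknown>;
}

interface InputConverter<T> {
    name: string;
    type: string;
    getInput(context: ContextStub): T;
}

class DurableClientStub {
    constructor(public readonly taskHubName: string) {}
    startNew(orchestratorName: string): string {
        return `instance-of-${orchestratorName}`;
    }
}

// df.input.durableClient() would return something like this:
function durableClientInput(): InputConverter<DurableClientStub> {
    const name = "durableClient";
    return {
        name,
        type: "durableClient",
        getInput(context) {
            // The raw payload is metadata, not the client itself;
            // the converter turns it into a usable client.
            const metadata = context.rawInputs[name] as { taskHubName: string };
            return new DurableClientStub(metadata.taskHubName);
        },
    };
}

// context.extraInputs.get(converter) could then simply delegate:
function getExtraInput<T>(context: ContextStub, converter: InputConverter<T>): T {
    return converter.getInput(context);
}

const context: ContextStub = {
    rawInputs: { durableClient: { taskHubName: "TestHub" } },
};
const client = getExtraInput(context, durableClientInput());
console.log(client.startNew("helloOrchestrator")); // instance-of-helloOrchestrator
```

The key property is that the conversion logic lives in the durable package, so @azure/functions never needs to know what a DurableClient is.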

I see two main drawbacks of this though:

const myRandomInput = {
    type: "someRandomType",
    name: "someRandomName",
    someKey: "someValue"
};

app.http('httpFunction', {
    extraInputs: [myRandomInput],
    handler: (request: HttpRequest, context: InvocationContext) => {
        const myRandomInputValue = context.extraInputs.get(myRandomInput);
    },
});

Admittedly, this isn't the worst drawback in the world, but the above example would've been the most useful for easier migration from old model to the new.

What I do like about this suggestion though is that it's a relatively easy* extensibility model to implement, and future-proofs us in case other SDKs/input types want to implement some sort of manipulation of the input themselves. Not sure how likely that is though.

* I'm not sure if there might be complications I'm not foreseeing for actually implementing this.

ejizba commented 1 year ago

I appreciate the feedback and discussion around app., but at the same time I'm fairly confident saying we won't change it at this time. We did a user study last summer exploring options for how to register functions, plus we've been sharing this with Node.js partners, and we've received overwhelmingly positive feedback on the current design. Using app as the main entrypoint (and specifically calling app.get) is highly idiomatic and frequently used in Node.js apps. Here's a hello world sample for an Express.js app (arguably the most used Node.js framework) that looks very similar to ours: https://expressjs.com/en/starter/hello-world.html

I also want to push back on the idea that df.app. is somehow bad. The df implementation comes from a separate npm package, and Node.js users should be extremely comfortable using a different prefix for functionality coming from a different package. This happens all the time in Node.js.

Tbh, I'm getting a bit lost with all the suggestions in this thread. Could we maybe narrow the scope back down and fork unrelated discussions to separate issues?

jviau commented 1 year ago

@ejizba understood, we can start a different discussion regarding app and leave this one to focus on the client.

@hossam-nasr - can we support both modes? Make InputConverter optional and use overloads of context.getInput (or context.extraInputs.get, whatever it is) to determine if it implements InputConverter<T> or not?
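One way this dual-mode lookup could be typed is with a runtime type guard plus overloads; the following is an illustrative, self-contained sketch (these names are stand-ins, not the real SDK surface):

```typescript
// Simplified stand-ins for illustration only.
interface FunctionInputStub {
    name: string;
    type: string;
}

interface InputConverter<T> extends FunctionInputStub {
    convert(rawValue: unknown): T;
}

// Runtime type guard distinguishing converters from plain input definitions.
function isInputConverter<T>(
    input: FunctionInputStub | InputConverter<T>
): input is InputConverter<T> {
    return typeof (input as InputConverter<T>).convert === "function";
}

// Overloads: converters yield their converted type, plain inputs yield unknown.
function getExtraInput<T>(raw: unknown, input: InputConverter<T>): T;
function getExtraInput(raw: unknown, input: FunctionInputStub): unknown;
function getExtraInput(
    raw: unknown,
    input: FunctionInputStub | InputConverter<unknown>
): unknown {
    return isInputConverter(input) ? input.convert(raw) : raw;
}

// Plain input: the raw value passes through untouched.
const plainInput: FunctionInputStub = { name: "blob", type: "blob" };
console.log(getExtraInput("raw-blob-bytes", plainInput)); // raw-blob-bytes

// Converter input: metadata is transformed before the handler sees it.
const clientInput: InputConverter<string> = {
    name: "client",
    type: "durableClient",
    convert: (raw) => `client-for-${(raw as { hub: string }).hub}`,
};
console.log(getExtraInput({ hub: "TestHub" }, clientInput)); // client-for-TestHub
```

With this shape, existing plain-object inputs keep working unchanged, while converter-style inputs opt in to the richer behavior.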

hossam-nasr commented 1 year ago

Thank you @ejizba for narrowing our scope and getting us more focused. For separate discussions, I've added the app. feedback to this issue here: https://github.com/Azure/azure-functions-nodejs-library/issues/12#issuecomment-1408972221, and I've also created an issue to discuss adding some kind of extensibility to the app namespace here: https://github.com/Azure/azure-functions-nodejs-library/issues/55.

Now, focusing on the client question, I've tried narrowing down the proposed solutions, filtering out the ones that require major structural changes to the core node library APIs, since as @ejizba pointed out, those have been informed by user studies and feedback from node.js partners, so probably won't change.

Option 1: Essentially the equivalent of the old model

This suggestion matches, but does not necessarily improve upon, the way the old model worked. The df.getClient() function requires only the context object.

const httpStart: HttpHandler = async (request: HttpRequest, context: InvocationContext) => {
    const client = df.getClient(context);
    const instanceId = await client.startNew(request.params.orchestratorName, undefined, request.text());
    context.log(`Started orchestration with ID = '${instanceId}'.`);
    return client.createCheckStatusResponse(request, instanceId);
};

app.http('durableOrchestrationStart1', {
    route: 'orchestrators/{orchestratorName}',
    extraInputs: [df.input.durableClient()],
    handler: httpStart,
});

Frankly, the only reason this wasn't the default implementation we went with was that, unlike in the old model, there wasn't a way to retrieve all input bindings from the context object in the new model. However, that's not a very hard feature to add, and it would make sense outside of durable applications too. Eric already has a PR out that adds this (https://github.com/Azure/azure-functions-nodejs-library/pull/54), and I have a version of the SDK based on that PR that can get the above code to work.

Based on this, let's make this option our new default baseline that we're comparing against, and we can decide if this experience is good enough already, or if we want to continue improving on it with one of the remaining options.

Option 2: Retrieve the client the same way as any other input

const clientInput = df.input.durableClient();

const httpStart: HttpHandler = async (request: HttpRequest, context: InvocationContext) => {
    const client: DurableClient = context.extraInputs.get(clientInput);
    const instanceId = await client.startNew(request.params.orchestratorName, undefined, request.text());
    context.log(`Started orchestration with ID = '${instanceId}'.`);
    return client.createCheckStatusResponse(request, instanceId);
};

app.http('durableOrchestrationStart1', {
    route: 'orchestrators/{orchestratorName}',
    extraInputs: [clientInput],
    handler: httpStart,
});

There could be many ways to get this to work, but I think the clearest would be @jviau's suggestion of having extraInputs receive "input converter" types instead of the FunctionInput type (essentially just an options object).

Pros:

Cons:

Option 3: Provide trigger-specific APIs to register client functions

This option treats client inputs as more of an exception than other input binding types. Given that durable client inputs are somewhat special (they are required, not optional, and there can be exactly one), I would argue that treating them as special is justified.

const clientHandler: DurableClientHandler = async (context: InvocationContext, request: HttpRequest, client: DurableClient) => {
    const instanceId = await client.startNew(request.query.get('orchestratorName'), undefined, request.text());
    context.log(`Started orchestration with ID = '${instanceId}'.`);
    return client.createCheckStatusResponse(request, instanceId);
};

df.app.httpClient('durableOrchestrationStart1', {
    route: 'orchestrators/{orchestratorName}',
    handler: clientHandler,
});

We obviously can't do this for all trigger types. There was a query that Eric shared at some point showing that ~40% of durable client functions used HTTP triggers and ~26% used timer triggers. Given that, I would suggest we only do this for HTTP and timer triggers, covering ~66% of scenarios, while the rest of the triggers can be covered by the "unhappy" scenario.

Pros:

Cons:

Option 4: Generalizable version of option 3

Instead of doing it for one trigger type, we generalize it:

const clientHandler: DurableClientHandler = async (context: InvocationContext, request: HttpRequest, client: DurableClient) => {
    const instanceId = await client.startNew(request.params.orchestratorName, undefined, request.text());
    context.log(`Started orchestration with ID = '${instanceId}'.`);
    return client.createCheckStatusResponse(request, instanceId);
};

df.app.client('durableOrchestrationStart1', {
    trigger: trigger.http({ route: 'orchestrators/{orchestratorName}' }),
    return: output.http({}),
    handler: clientHandler,
});

Pros:

Cons:

Note that option 3 and option 4 can co-exist. We could have option 3 for HTTP and timer triggers to cover most scenarios, and option 4 for other trigger types.

Option 5: A "clientify" API, similar to promisify()

We add an API to "transform" any method from @azure/functions that registers a function and accepts a handler of type FunctionHandler, into one that accepts a handler of type DurableClientHandler:

import * as df from "durable-functions"
import { app } from "@azure/functions" 

const registerHttpClient = df.app.clientify(app.http) // haven't come up with a better name for this utility
const clientHandler: DurableClientHandler = async (context: InvocationContext, request: HttpRequest, client: DurableClient) => {
    const instanceId = await client.startNew(request.query.get('orchestratorName'), undefined, request.text());
    context.log(`Started orchestration with ID = '${instanceId}'.`);
    return client.createCheckStatusResponse(request, instanceId);
};

registerHttpClient('httpClient', {
    handler: clientHandler,
    route: '/orchestrators/{orchestratorName}'
});

Pros:

Cons:

Proposal

I would personally favor option 3 for HTTP and timer triggers specifically. As mentioned earlier, I think treating the durable client input as special is justified, and having this utility for HTTP and timer triggers provides as close to a no-compromises solution as we can get for the large majority of scenarios. For the rest of the trigger types, I'm more on the fence, but I would favor either option 4 or option 2 (if feasible). I also don't think keeping option 1 is so bad 😛

davidmrdavid commented 1 year ago

Thanks @hossam-nasr. I agree we should stick with option 1 (already merged, AFAIK) for now and revisit further improvements later. There are some good ideas here that will take time to fully evaluate.