dapr / proposals

Proposals for new features in Dapr
Apache License 2.0

Proposal: serverless compute target #16

Open ItalyPaleAle opened 1 year ago

ItalyPaleAle commented 1 year ago

This is a proposal for being able to start a Dapr runtime ("daprd") as a proxy to serverless compute resources, a collective term that includes things like Azure Functions, AWS Lambda, Cloudflare Workers (and workerd), OpenFaaS, etc.

The goal is to allow developers to build apps that can leverage serverless compute in a smooth way, transparently integrating with Dapr. It allows building apps that can leverage flexible, on-demand compute resources, as a target for Dapr service invocation, PubSub messages, or input bindings.

This will be implemented by allowing Dapr to use a new class of components, called "compute", as a target for Dapr API calls, instead of an app running side-by-side with the Dapr runtime.

olitomlinson commented 1 year ago

I've read the proposal and I'm unsure why there is so much emphasis on the word 'serverless'.

I can only assume that, because of the emphasis on serverless, it won't be possible to proxy onto a traditional web server that exposes HTTP endpoints?

To me, the underlying host technology (serverless or otherwise) is not important if daprd is just invoking HTTP endpoints on the target.

Why must the CRD know about the external compute's infrastructure via the Type attribute, i.e. compute.azure.functions? For example, why would daprd operate differently if the target was an Azure Function vs an AWS Lambda?

yaron2 commented 1 year ago

Your feedback is spot on. There's nothing "serverless" about the features and specification pointed out here, and there's no common ground between them. This is basically about adding an authentication mechanism for external, non-Dapr HTTP endpoints, and it is really a nice extension that augments my own proposal.

There may be differences in how to authenticate or call an Azure Function or AWS Lambda, but the differences are semantics of an HTTP endpoint, not a serverless platform.

To illustrate this further:

  1. An organization wants to use service invocation to call an external proprietary system via HTTP, and that system has specific authentication requirements
  2. An organization wants to use service invocation to invoke any HTTP cloud service that's not considered "serverless", e.g. Azure App Service, AWS Elastic Beanstalk, etc.
  3. An organization wants to use service invocation to invoke an API gateway with custom auth, whether internal or external to their org
  4. An organization wants to use service invocation to invoke a non-Dapr HTTP service on Kubernetes or a VM

These examples are very common use cases and none of them are considered serverless, yet all of them could be enabled if Dapr could invoke external non-Dapr endpoints and had the proper authentication configuration to do so, which this proposal covers well.
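To make the point about authentication concrete, here is a minimal sketch of how per-endpoint auth could be modeled as plain HTTP header strategies. Everything in it is an assumption for illustration only: the target names, the URLs, the `x-api-key` header, and the `headers_for` helper are hypothetical and are not part of Dapr or of either proposal.

```python
import base64
from typing import Callable, Dict, Tuple

# A strategy turns a secret into HTTP headers. Nothing here depends on
# whether the endpoint is hosted on a "serverless" platform or not.
AuthStrategy = Callable[[str], Dict[str, str]]


def bearer_token(token: str) -> Dict[str, str]:
    """Send the secret as an OAuth-style bearer token."""
    return {"Authorization": f"Bearer {token}"}


def api_key_header(header_name: str) -> AuthStrategy:
    """Send the secret in a custom header chosen by the endpoint owner."""
    def apply(key: str) -> Dict[str, str]:
        return {header_name: key}
    return apply


def basic_auth(username: str) -> AuthStrategy:
    """Send the secret as the password of an HTTP Basic credential."""
    def apply(password: str) -> Dict[str, str]:
        raw = f"{username}:{password}".encode()
        return {"Authorization": "Basic " + base64.b64encode(raw).decode()}
    return apply


# Hypothetical targets mirroring the four use cases listed above.
TARGETS: Dict[str, Tuple[str, AuthStrategy]] = {
    "legacy-erp": ("https://erp.example.invalid", basic_auth("svc-dapr")),
    "api-gateway": ("https://gw.example.invalid", api_key_header("x-api-key")),
    "internal-svc": ("http://orders.example.invalid", bearer_token),
}


def headers_for(target: str, secret: str) -> Dict[str, str]:
    """Look up the target's strategy and build its auth headers."""
    _, strategy = TARGETS[target]
    return strategy(secret)
```

In this sketch the only thing that varies between targets is which headers are attached, which is the sense in which the differences are semantics of an HTTP endpoint rather than of a hosting model.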

So my main feedback here is to join this proposal with mine (which I am going to transfer here from https://github.com/dapr/dapr/issues/4549), as it adds important missing pieces to it. I'm open to co-authoring, @ItalyPaleAle.

ItalyPaleAle commented 1 year ago

Thanks for the feedback.

I think dapr/dapr#4549 has some similarities, but the two proposals are trying to solve different problems. I don't think they can be bundled together, to be honest.

The core part of this proposal is the change to the Configuration CRD to have Dapr initialize an app channel to an arbitrary HTTP(S) endpoint. It will still require a daprd process for each app, but it won't require a locally-running app (i.e. one in the same pod). Instead, daprd will initialize an app channel towards an external HTTP(S) endpoint.
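As a rough illustration of that idea, the sketch below models an app channel whose base URL can point either at a local app or at an external HTTP(S) endpoint. This is not Dapr code: the `AppChannel` class, its fields, the port 3000 default, and the example URL are all hypothetical, and the real change is described in terms of the Configuration CRD rather than a Python object.

```python
from dataclasses import dataclass
from urllib.request import Request


@dataclass
class AppChannel:
    """Hypothetical model of where a sidecar delivers inbound calls."""

    # Assumed default: an app listening next to the sidecar on localhost.
    base_url: str = "http://127.0.0.1:3000"

    def build_request(self, method: str, path: str, body: bytes = b"") -> Request:
        """Build (but do not send) the HTTP request for an inbound call."""
        url = self.base_url.rstrip("/") + "/" + path.lstrip("/")
        return Request(url, data=body or None, method=method)


# Conventional setup: the app runs side-by-side with the sidecar.
local = AppChannel()

# Setup sketched in the proposal: no local app, the channel points outward.
external = AppChannel(base_url="https://functions.example.invalid/api/")
```

Calling `external.build_request("POST", "orders", b"{}")` yields a request aimed at the external endpoint, while the same call on `local` targets the side-by-side app. The caller does not need to know which of the two it is talking to.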

Here's how I see the two proposals, side-by-side:

| dapr/dapr#4549 | This proposal |
| --- | --- |
| Only involves service invocation | Applies to service invocation and inbound pubsub and binding messages |
| Dapr on caller app invokes the external endpoint directly | (For service invocation) A daprd runtime is required on the callee side too. The caller daprd invokes the callee daprd |
| High degree of flexibility for making service invocation calls | Very low degree of flexibility: target serverless code expects inputs received in a very specific format |
| Goal is to allow invoking endpoints outside of Dapr | Goal is to allow developers to scale their apps by leveraging "serverless" compute resources |

In this case, it's sort of required for the target to be a serverless compute endpoint, if not technically then at least "philosophically", so that it achieves the goals explained above.

yaron2 commented 1 year ago

I still think it makes sense to combine the proposals into one. But regardless of https://github.com/dapr/dapr/issues/4549, I would strongly advise focusing this proposal on the ability to have Dapr invoke non-localhost HTTP endpoints for service invocation, bindings, and pub/sub, and removing any mention of serverless altogether, as there are no serverless semantics here, just HTTP and custom auth ones. Until then, this is a showstopper for me.

olitomlinson commented 1 year ago

I agree. So far, I don't see how the target being serverless or not is relevant to the proposal.

Being able to use burstable/serverless/FaaS to do work at scale is a nice by-product of being able to invoke a target that is not in the cluster. It's certainly a handy use-case. But yah, I can't see why the serverless aspect takes such importance in the proposal?

As it stands, the proposed component name of compute feels wrong to me, as dapr isn't creating compute. Nor is it just leveraging compute. It's calling some pre-deployed business logic/domain/system/app, which will likely have an entire SDLC attached to it, so compute feels like a misnomer...

The idea of the external target being represented by an app-less pod (with just a dapr sidecar) is interesting, and I'm ok with that. It gives the appearance that the external resource is internal to the cluster which is nice. I can certainly see how this fits nicely to allow the external target system to inherit PubSub subscription capabilities / input bindings.

However, the idea runs out of runway if the external app can't originate its own outbound HTTP requests back into that dapr sidecar to leverage the full gamut of sidecar capabilities...? We might have to be careful about how this is positioned, so as not to create confusion among potential users who think this is trying to establish a bidirectional dapr mesh.

Sorry if this sounds like I'm being overly critical, just sharing my initial thoughts :)