Decouple triggers and runtimes

andresmgot commented 6 years ago

Currently, each new runtime requires a separate container image per trigger. We should design a runtime abstraction so that:

Triggers can be added (HTTP, event, or something else).
These can be written in a single language, such as Go.
Runtimes can be added more easily

We need to define the interface between the trigger container and the runtime, including which protocol to use to pass the request and response.

deissnerk commented 6 years ago

Very interesting. Is the trigger container supposed to be a sidecar to the runtime or a pod on its own?

sebgoa commented 6 years ago

@deissnerk that's TBD. I think Andres has a proposal that he is going to submit as PR, so that we can discuss there.

andresmgot commented 6 years ago

@deissnerk we are proposing to split triggers and runtimes in different pods. This is the design document we have been preparing: https://github.com/kubeless/kubeless/pull/396

Any feedback is welcome :)

sebgoa commented 6 years ago

@andresmgot can you submit this doc as a PR so that everyone can review and comment in the PR?

thanks

deissnerk commented 6 years ago

@andresmgot thanks for providing the document! The overall concept looks really promising to me. I have a few questions, though.

Cardinality of Trigger Pods

It is mentioned that there might be multiple functions per Kafka consumer or ingress controller. In which cases would a new consumer be created? For an ingress, I suppose the ingress controller is handling this. Does this mean that there will be trigger controllers?

Message Delivery Semantics

The diagram in the document shows just an arrow pointing from the Kafka consumer to the runtime but not back. I understand that this is because of the asynchronicity of messaging. I am not very familiar with Kafka, but other message brokers distinguish at-most-once and at-least-once delivery of messages. For this purpose the message consumer has to send an ACK to the broker after successful processing of the message. To my knowledge OpenWhisk provides at-least-once for all function invocations. Wouldn't it be important for the triggers in your diagram to get notified if a function succeeds or fails and to repeat the invocation if necessary? The message delivery semantics could also be an additional piece of metadata when creating the trigger. One kind of failure of a function could be a timeout. So there might be a relation to #365.

Function Input Using Ingress

In the document you suggest a format for the function input. How does this work in the case of HTTP? Who would take an HTTP request coming from a client and transform it into the internal request format? The standard ingress is not aware of this.

andresmgot commented 6 years ago

Thank you @deissnerk for raising those points. This is how I see them:

Cardinality of Trigger Pods

It is mentioned that there might be multiple functions per Kafka consumer or ingress controller. In which cases would a new consumer be created? For an ingress, I suppose the ingress controller is handling this. Does this mean that there will be trigger controllers?

Yes, we can call them "trigger controllers". We need an entity that registers triggers and associates them with a runtime pod/function. For HTTP triggers, this role can be assumed by the ingress controller. For Kafka, we will need to create a new entity that creates the Kafka consumers and triggers functions when messages are received on a certain topic. We are just talking about moving the Kafka consumer from the runtime image (where it is currently created) to a centralized service.
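As a rough illustration of the "trigger controller" idea above, here is a minimal Go sketch of the bookkeeping such a centralized service might keep: a table from topic to the functions subscribed to it. The `Registry` type, its methods, and the service URLs are hypothetical names made up for this example; nothing here is taken from the design document or from the Kubeless code.

```go
package main

import (
	"fmt"
	"sync"
)

// Registry is a hypothetical in-memory table that a "trigger controller"
// could keep. It maps a topic to the function endpoints subscribed to it.
type Registry struct {
	mu      sync.RWMutex
	byTopic map[string][]string // topic -> function service URLs (assumed form)
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{byTopic: make(map[string][]string)}
}

// Register associates a function endpoint with a topic.
func (r *Registry) Register(topic, functionURL string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.byTopic[topic] = append(r.byTopic[topic], functionURL)
}

// Targets returns a copy of the endpoints to invoke for a message on topic.
func (r *Registry) Targets(topic string) []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return append([]string(nil), r.byTopic[topic]...)
}

func main() {
	reg := NewRegistry()
	// Two functions share one topic, so a single consumer can serve both.
	reg.Register("orders", "http://resize.default.svc:8080")
	reg.Register("orders", "http://audit.default.svc:8080")
	for _, url := range reg.Targets("orders") {
		fmt.Println("would invoke", url)
	}
}
```

With a table like this, one consumer per topic is enough regardless of how many functions listen on it, which is one possible answer to the cardinality question.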

Message Delivery Semantics

The diagram in the document shows just an arrow pointing from the Kafka consumer to the runtime but not back. I understand that this is because of the asynchronicity of messaging. I am not very familiar with Kafka, but other message brokers distinguish at-most-once and at-least-once delivery of messages. For this purpose the message consumer has to send an ACK to the broker after successful processing of the message. To my knowledge OpenWhisk provides at-least-once for all function invocations. Wouldn't it be important for the triggers in your diagram to get notified if a function succeeds or fails and to repeat the invocation if necessary?

You are right. The function can still return a value or an error and the Kafka consumer can retrieve that.
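To make the at-least-once idea concrete, here is a small Go sketch of a trigger that retries a function call and only treats the message as acknowledgeable after a successful return. `Invoke`, `Deliver`, and `ErrTimeout` are hypothetical names for this example, and the broker-side ACK is left out on purpose, so this is a sketch of the retry logic only and not of any real Kafka client API.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrTimeout is a stand-in for one kind of function failure (hypothetical).
var ErrTimeout = errors.New("function timed out")

// Invoke stands in for the call from the trigger to the function runtime.
type Invoke func(payload []byte) error

// Deliver calls invoke until it succeeds or maxAttempts is reached.
// A nil error means the caller may now acknowledge the message to the broker.
func Deliver(invoke Invoke, payload []byte, maxAttempts int) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = invoke(payload); err == nil {
			return attempt, nil
		}
	}
	return maxAttempts, fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	// A function that fails twice before succeeding.
	flaky := func(payload []byte) error {
		calls++
		if calls < 3 {
			return ErrTimeout
		}
		return nil
	}
	attempts, err := Deliver(flaky, []byte("hello"), 5)
	fmt.Println("attempts:", attempts, "error:", err)
}
```

Because the function may run more than once for the same message, functions used this way would need to tolerate duplicate invocations.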

The message delivery semantics could also be an additional piece of metadata when creating the trigger. One kind of failure of a function could be a timeout. So there might be a relation to #365.

Indeed. That's a valid use case. We didn't get into the specifics of the trigger definition, and we may want to keep it simple at the beginning, but it is something important to keep in mind.

Function Input Using Ingress

In the document you suggest a format for the function input. How does this work in the case of HTTP? Who would take an HTTP request coming from a client and transform it into the internal request format? The standard ingress is not aware of this.

Yes, this can't be done in the Ingress controller, so the runtime container should be the one that receives a plain HTTP request and parses it into the standard format.

Let me know if you agree or have other concerns, and I will update the document.

deissnerk commented 6 years ago

@andresmgot

Yes, it is not possible to do it in the Ingress controller so it should be the runtime container the one that will receive a simple HTTP request and should parse it into the standard format.

This way there would be different interfaces for HTTP and other triggers, right?

Have you looked at #241 yet? I wonder if Istio could somehow be a nice addition here.

andresmgot commented 6 years ago

This way there would be different interfaces for HTTP and other triggers, right?

I was thinking of having the same interfaces:

Between the "trigger controller" and the runtime: a simple HTTP request with custom headers.
Between the runtime and the function: the "parameters" object specified in the document.

The runtime should have enough information to build the parameters object; that information can be specified by the request or by the environment.
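The two interfaces above could look roughly like the following Go sketch: the trigger side builds a plain HTTP request carrying event metadata in headers, and the runtime side turns that request into a parameters value for the function. The header names (`X-Event-Id`, `X-Event-Type`), the `Params` fields, and the fallback argument are all assumptions made for illustration; the actual format is whatever the design document specifies.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// Hypothetical header names; the real ones would come from the design document.
const (
	HeaderEventID   = "X-Event-Id"
	HeaderEventType = "X-Event-Type"
)

// Params is an assumed shape for the "parameters" object passed to a function.
type Params struct {
	Data      string
	EventID   string
	EventType string
}

// NewTriggerRequest builds what a trigger controller might send to the runtime.
func NewTriggerRequest(runtimeURL, eventID, eventType, body string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, runtimeURL, strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set(HeaderEventID, eventID)
	req.Header.Set(HeaderEventType, eventType)
	return req, nil
}

// BuildParams is what the runtime might do with an incoming request.
// defaultType stands in for information taken from the environment when
// the request (for example a plain HTTP call via ingress) has no type header.
func BuildParams(r *http.Request, defaultType string) (Params, error) {
	defer r.Body.Close()
	body, err := io.ReadAll(r.Body)
	if err != nil {
		return Params{}, err
	}
	eventType := r.Header.Get(HeaderEventType)
	if eventType == "" {
		eventType = defaultType
	}
	return Params{
		Data:      string(body),
		EventID:   r.Header.Get(HeaderEventID),
		EventType: eventType,
	}, nil
}

func main() {
	req, err := NewTriggerRequest("http://runtime.local/", "42", "kafka", "hello")
	if err != nil {
		panic(err)
	}
	params, err := BuildParams(req, "http")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", params)
}
```

In this sketch a request arriving without the custom headers still yields a valid parameters value, which is how HTTP and other triggers could share one interface.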

Have you looked at #241, yet?

Not yet. I will give it a try and check the possibilities.

On another topic, I am going to open a PR with the document content. It will be easier to discuss there, and once we have a final decision we can merge it and leave it in the repository for future reference.

andresmgot commented 6 years ago

PR with the document: https://github.com/kubeless/kubeless/pull/396

andresmgot commented 6 years ago

This can be closed with v0.5.0 and #620.
