Closed slinkydeveloper closed 3 years ago
I'd highly appreciate this for our v1
API shape
❤️ this
I'm starting to lean towards using Javascript. I know it may sound crazy, but:
Another alternative could be using Json Schema, but I feel this could make the predicates too much convoluted both to read and write:
predicate:
allOf:
- properties:
attribute-a:
const: aaa
- properties:
attribute-b:
const: bbb
@grantr put some thought on this sometime back. Might be worth syncing with him. IIRC he used CEL https://github.com/google/cel-spec, but there were some problems regarding scalability.
I checked out CEL, problem is that the only implementation out there is for Golang. I wish to avoid locking the "implementatibility" of Eventing APIs (in particular of the data-plane) to Golang only.
@slinkydeveloper The problem with the JavaScript filter engine is that you will have a hard time validating the content. Not only you need syntactical validation, but functional checks as well. Worst would be that a filter JS expression shows to be wrong in seldom situations that haven't been tested and where you're live in production already
I'm totally on board with making the filter expressions more powerful. My $.02 is that this can be an additive (backwards compatible) change to the API and we do not have to get it in .16 where we bump the Trigger spec to v1.
wrt: javascript, I just can never remember how many = I have to specify to actually get the sane semantics :) jk
Not only you need syntactical validation, but functional checks as well
We can perform a pretty simple validation on control plane side trying to execute the predicate, giving to it an dummy cloudevent and check if it fails badly with an error or returns a boolean value.
wrt: javascript, I just can never remember how many = I have to specify to actually get the sane semantics :) jk
Well, because all/most attributes (in the first iteration at least) are supposed to be compared strings, you just need ==
Not only you need syntactical validation, but functional checks as well
We can perform a pretty simple validation on control plane side trying to execute the predicate, giving to it an dummy cloudevent and check if it fails badly with an error or returns a boolean value.
well, the filter could behave differently on Wednesdays when the temperature is below freezing point ;) To prevent that you'd have to limit Javascript to be able to only use builtins that you provide
We had a lot of discussions about filtering in the past. I encourage you to look for all the open issues related to this topic.
A good starting point is Grant's proposal: https://github.com/knative/eventing/blob/master/docs/broker/filtering.md
I remember one thing we talked about is being able to do filtering at the source, in particular when the source is a SQL database.
this can be an additive (backwards compatible) change to the API and we do not have to get it in .16 where we bump the Trigger spec to v1.
+1 on that @vaikas
I'm starting to lean towards using Javascript
I think this would be problematic in the long run. Lots of scope for unexpected pauses and a shiny magnet for injection attacks.
I agree with @vaikas and others on timing: This is important but seems like it can be a backward compatible addition post-v1. I welcome contrary opinions. :smiley:
I commented on the (in)feasibility of Javascript as a filtering language at https://github.com/knative/eventing/blob/master/docs/broker/filtering.md#javascript-or-lua:
Javascript and Lua programs may not provably terminate, may have memory leaks, and may consume excessive amounts of CPU.
To be fair, this line of argument is mostly sourced from the authors of CEL so there's some selection bias. I personally haven't attempted to embed javascript in a program safely.
I checked out CEL, problem is that the only implementation out there is for Golang.
There's also a C++ implementation, which technically makes it possible to embed in Java via JNI, but I think the general point is valid.
One possibility for making this less of an issue is to allow filters to be evaluated in a separate process. If we had a protocol to execute filters remotely, then the safety and ease of implementation is less critical. Combine that with a broker-specific or cluster-wide registry of filter types and evaluation services, and now any event producer or consumer can execute any filter.
# spitballing, this doesn't actually exist
spec:
filters:
attributes:
type: BuiltIn
javascript:
type: Remote
ref:
apiVersion: v1
kind: Service
name: javascript-event-filter
By focusing on making it easier to introduce a new filtering mechanism, we can avoid deciding which expression language is best for all time :grin:
Even if we have this custom filtering mechanism, I think there should still be a baseline set of filters that work everywhere (probably not an EL), but that's a discussion for another issue.
Big fan of the idea of having the possibility to use a remote filter here ☝️ It might not be the most performant approach, but it surely avoids implementing something that's either too opinionated or not powerful enough.
One possibility for making this less of an issue is to allow filters to be evaluated in a separate process
If we say that the filter is a remote service, we don't need it to be a JS specific thing. But that increases the usage complexity, the roundtrips required to serve a message and the filter service capacity must be pretty big. I feel we could mitigate a bit the problem with some engineering (sidecar service with UDS HTTP/2 connection for example), but you're still introducing a performance penalty on the hot path.
Javascript and Lua programs may not provably terminate, may have memory leaks, and may consume excessive amounts of CPU.
I recognize that, as soon as the language is turing complete, the user could always create some kind of "poison pill". But these engines have countermeasures for that, like https://github.com/robertkrimen/otto#halting-problem.
There's also a C++ implementation, which technically makes it possible to embed in Java via JNI, but I think the general point is valid.
Yeah exactly, I think CEL is an interesting language and definitely suits the filtering capabilities we need but, given the available libraries out there supporting it, it could be a serious problem for the adoption of Knative APIs.
By focusing on making it easier to introduce a new filtering mechanism, we can avoid deciding which expression language is best for all time
So are you proposing to make the filtering capabilities set "extensible" so then every broker can implement its own set? At the first glance, this definitely helps the adoption, but it sounds like a problem for the users both for portability of applications (from one broker impl to another, eg from our in memory broker for PoC to a production broker impl) and for the tooling around Knative (UIs, CLI, etc). I know it may sound hard as a goal, but I would prefer to find a powerful filtering solution interoperable across different broker implementations.
related topic: in case we get Turing-complete filters, please also consider result filters to decide on redelivery or posting to different channels
I am curious to read a clear use-case for this. What type of events getting in and why do you need advanced filtering. I have my views but it would be nice to read it clearly in this issue.
AFAIK OPA policy language is used in Kubernetes already - is it expressive enough for filter expressions? Deny/Allow should map well to filtering events? They have to deal with making sure resources used for OPA execution are wll controlled as nobody wants policy language runtime to impact cluster performance:
https://www.openpolicyagent.org/docs/latest/deployments/#kubernetes
AFAIR OPA is to be used for Knative Eventing security?
I propose a solution using JS here: https://github.com/knative/eventing/pull/3771 feel free to take a look and review
That looks ot me as non-trivial feature - I could not find google doc with the proposal?
https://knative.dev/community/contributing/mechanics/feature-tracks/#step-2-the-proposal
Was it maybe linking in some other issue?
My concern is how pluggable is the YAML and code? Could "opa:" or "js:" prefix for expressions be used?
My concern is how pluggable is the YAML and code? Could "opa:" or "js:" prefix for expressions be used?
In that case, just renaming the field expression
to jsExpression
seems reasonable and extensible enough
This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen
. Mark the issue as
fresh by adding the comment /remove-lifecycle stale
.
/remove-lifecycle stale
This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen
. Mark the issue as
fresh by adding the comment /remove-lifecycle stale
.
/remove-lifecycle stale
The CNCF Serverless WG is working on a proposal: https://github.com/cloudevents/spec/pull/771
This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen
. Mark the issue as
fresh by adding the comment /remove-lifecycle stale
.
/remove-lifecycle stale
There are many duplicates of the same issue. I'm cleaning old ones like this for the favor of the newest more up to date #5204. We can reopen if needed.
Problem Now the
Trigger
filtering abilities are poor, because they require direct match on specificCloudEvent
attributes.Persona: User
Exit Criteria
Trigger
APIs should have a more powerful filtering abilitiesTime Estimate (optional): 2
Additional context (optional) Together with @grantr we discussed the possibility to add an expression language to express the matching predicate. We need to find an expression language with implementations in different programming languages and that covers the basic string operations.
An example trigger could be: