knative / eventing

Event-driven application platform for Kubernetes
https://knative.dev/docs/eventing
Apache License 2.0

Discovery of importers and how to create them #1550

Closed mikehelmick closed 4 years ago

mikehelmick commented 5 years ago

Problem: In working through #1381, one of the key things we have found when targeting the user experience is that discoverability is highly important. For motivations, we look to the Scenarios for Knative Eventing.

To recap the 3 scenarios described there:

  1. FaaS - directly connect an importer (specific instance) to a consumer

  2. Event-Driven - current implementation of Broker + Trigger

  3. Black-box integration - scenario 3 is a specialization of both scenarios 1 and 2. From a developer’s viewpoint, consuming events is just like scenario 2 (Event-Driven) in that if an event is available on my cluster, I am able to trigger off of it.

The current registry implementation is event type centric (the CRD is literally called EventType). This works really well for scenario 2 (on-cluster, event-driven scenario) and scenario 3 (black-box integration).

This works less well for scenario 1 (FaaS) because it makes discovery of events via source more difficult to build. This doesn’t mean that the existing registry is wrong or less useful. There is currently one entry in the registry per event type; each entry is actually a tuple of Source and Type.

We propose introducing a new importer registry.

This gives the overall registry 2 components working together to facilitate discovery and configuration of events. The importer registry will be responsible for informing a user (and tooling) which events they may solicit from which producers, and how to configure those importers.

When an importer is configured, the importer is responsible for populating the EventType appropriately.

As a concrete example, the importer registry tells me that I can get Google Cloud Storage “finalize” events by instantiating the GCSSource kind. If two GCSSource objects are created for finalize events, the finalize event type should only appear in the EventType registry once. The EventType should reflect which importers are providing that type (this can be an annotation, status, etc., to be designed).
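The deduplication rule in the GCSSource example can be illustrated with a minimal sketch. The Importer and RegistryEntry types and the BuildRegistry function are hypothetical names made up for illustration; they are not part of the Knative API, and how the providing importers are actually recorded is explicitly left open above.

```go
package main

import (
	"fmt"
	"sort"
)

// Importer is a hypothetical, simplified view of a configured importer
// (for example a GCSSource object) and the event types it emits.
type Importer struct {
	Name       string
	EventTypes []string
}

// RegistryEntry is a hypothetical EventType registry entry: one per event
// type, listing every importer that currently provides it.
type RegistryEntry struct {
	Type      string
	Importers []string
}

// BuildRegistry collapses the importers into one entry per event type.
func BuildRegistry(importers []Importer) []RegistryEntry {
	byType := map[string][]string{}
	for _, imp := range importers {
		for _, t := range imp.EventTypes {
			byType[t] = append(byType[t], imp.Name)
		}
	}
	entries := make([]RegistryEntry, 0, len(byType))
	for t, names := range byType {
		sort.Strings(names)
		entries = append(entries, RegistryEntry{Type: t, Importers: names})
	}
	// Sort by type so the result does not depend on map iteration order.
	sort.Slice(entries, func(i, j int) bool { return entries[i].Type < entries[j].Type })
	return entries
}

func main() {
	entries := BuildRegistry([]Importer{
		{Name: "gcs-a", EventTypes: []string{"finalize"}},
		{Name: "gcs-b", EventTypes: []string{"finalize"}},
	})
	for _, e := range entries {
		fmt.Printf("%s provided by %v\n", e.Type, e.Importers)
	}
}
```

Two importers configured for the same type yield a single entry that names both of them.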

Persona: Event Consumer

Exit Criteria: A CLI can be written that can programmatically discover and configure importers.

Time Estimate (optional): 5d

rhuss commented 5 years ago

The idea is that CRs only populate spec, as is typically done. We are hacking the CRD to include additional info for registry purposes, but we do not expect users to configure in their CRs the registry section. I added a description there saying something like: "Internal information for registry purposes. Users should not set this property"

tbh, that is indeed very hackish and totally confuses types with values. Also misusing the openAPI "pattern" field to be used as concrete values does not feel right. According to the openAPI documentation pattern "SHOULD be a valid regular expression, according to the ECMA 262 regular expression dialect" (so e.g. "." should be escaped if they are meant to be dots like in google.storage.object.archive) and its purpose is according to the schema validation specification :

5.8.  pattern

   The value of this keyword MUST be a string.  This string SHOULD be a
   valid regular expression, according to the ECMA 262 regular
   expression dialect.

   A string instance is considered valid if the regular expression
   matches the instance successfully.  Recall: regular expressions are
   not implicitly anchored.

Although it's only a recommendation (SHOULD), I wonder why we wouldn't follow that. Also, the purpose of this field is for validation of a given value, not as a value for itself.

To be honest, I would expect a cleaner design for an official API, rather than marking parts of a CRD as 'internal' via documentation and re-interpreting an OpenAPI schema. Why not choose a new entity which contains a CRD and the event-type values as two distinct objects as properties (more or less like a ClusterServiceVersion managed by the Operator Lifecycle Manager does)?

Only because something is technically possible doesn't mean it should be (mis)used for any purpose.

gcsCredsSecret:
  name: my-key
  key: key.json

How would this be translated to the CLI? Like --arg gcsCredsSecret='{name:"my-key", key:"key.json"}', with the client trying to parse the argument as JSON and putting it as the value into the gcsCredsSecret field?

This might work in this simple example, but for more complex sources (as mentioned, the Camel source comes to mind), I'm afraid that this quickly becomes ugly, leading to huge CLI command lines with not much benefit over having plain YAML files.
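A rough sketch of the client-side parsing this question implies, under the assumption that the flag value has the form field=<json>. ParseArg and the --arg flag shape are hypothetical; no such flag is confirmed to exist in any Knative CLI. Note that a strict JSON parser such as Go's encoding/json requires quoted keys, so the shorthand {name:"my-key", key:"key.json"} would be rejected as written.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ParseArg splits a hypothetical "--arg field=<json>" value into the field
// name and the decoded JSON value to place under that field in the CR spec.
func ParseArg(arg string) (string, interface{}, error) {
	parts := strings.SplitN(arg, "=", 2)
	if len(parts) != 2 || parts[0] == "" {
		return "", nil, fmt.Errorf("expected field=<json>, got %q", arg)
	}
	var value interface{}
	if err := json.Unmarshal([]byte(parts[1]), &value); err != nil {
		return "", nil, fmt.Errorf("field %s: %w", parts[0], err)
	}
	return parts[0], value, nil
}

func main() {
	// Strict JSON with quoted keys parses into a nested map.
	field, value, err := ParseArg(`gcsCredsSecret={"name":"my-key","key":"key.json"}`)
	fmt.Println(field, value, err)

	// The unquoted-key shorthand is not valid JSON and yields an error.
	_, _, err = ParseArg(`gcsCredsSecret={name:"my-key", key:"key.json"}`)
	fmt.Println(err != nil)
}
```

Even in this small case the user has to get shell quoting and JSON quoting right at the same time, which supports the worry that deeper structures become unwieldy on a command line.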

rhuss commented 5 years ago

If we insist on this design (which still feels wrong, since any field that should not be populated by a user, like status:, should not be part of a validation schema), then maybe default could be used instead of pattern as the value field, as it is a plain string without implied semantics. The registry: field could then be populated for a given CR by the controller with these values, as a read-only reference maybe?

matzew commented 5 years ago

I would expect a cleaner design for an official API and not marking parts of a CRD as 'internal' via documentation and re-interpreting an openApi schema.

I agree with this concern

matzew commented 5 years ago

Could a label or annotation be used for stashing this info? IMO it's a pity that the user is not allowed to set the field, which kind of makes it an internal, but exposed, API.

rhuss commented 5 years ago

E.g. I could imagine to use annotations of the form

annotations:
   knative.dev.source.eventType.addResource: "dev.knative.apiserver.resource.add"
   knative.dev.source.eventType.deleteResource: "dev.knative.apiserver.resource.delete"
   ....

and then a query on all annotations with a prefix knative.dev.source.eventType.
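The prefix query could look roughly like the following sketch. EventTypesFromAnnotations is a hypothetical helper, and the annotation keys are the ones suggested in this comment, not an agreed convention.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// EventTypesFromAnnotations returns the values of all annotations whose key
// starts with prefix, sorted for stable output.
func EventTypesFromAnnotations(annotations map[string]string, prefix string) []string {
	var types []string
	for key, value := range annotations {
		if strings.HasPrefix(key, prefix) {
			types = append(types, value)
		}
	}
	sort.Strings(types)
	return types
}

func main() {
	annotations := map[string]string{
		"knative.dev.source.eventType.addResource":    "dev.knative.apiserver.resource.add",
		"knative.dev.source.eventType.deleteResource": "dev.knative.apiserver.resource.delete",
		"unrelated.example/owner":                     "team-a", // ignored: different prefix
	}
	for _, t := range EventTypesFromAnnotations(annotations, "knative.dev.source.eventType.") {
		fmt.Println(t)
	}
}
```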

matzew commented 5 years ago

I like that suggestion, @rhuss

n3wscott commented 5 years ago

or like

annotations:
   registry.knative.dev/eventType.addResource: "dev.knative.apiserver.resource.add"
   registry.knative.dev/eventType.deleteResource: "dev.knative.apiserver.resource.delete"
   ....

nachocano commented 5 years ago

Thanks @matzew for reviving this thread. And thanks @rhuss for all your comments. I've been working on observability so I lost track of this.... I think we might be able to use annotations for this, you guys are right. And it might be cleaner. The reason why we didn't in the first place was to "address" the case where you might need different configuration params for different event types (some required, some not). I think that probably won't happen, and if it does, we can revisit later on...

The annotations should be able to include more things than just the type, also the schema, a description, and maybe some others.. But I don't see any problems with that...

Let's move this one to the 0.10 milestone so that we can reach an agreement and then get this in asap.

grantr commented 5 years ago

/milestone v0.10.0

sixolet commented 5 years ago

I like the idea that if you need different configuration params for different event types, that's a different importersource CRD.

nachocano commented 4 years ago

Resuming this thread so that we can reach an agreement and close this... How about adding a single registry annotation to source CRDs with all the info we were previously putting inside the openAPIV3Schema schema? It will look something like:

annotations:
    registry.knative.dev/eventTypes: |
      [
        { "type": "dev.knative.apiserver.resource.add", "schema": "my-schema", "description": "blah"},
        { "type": "dev.knative.apiserver.resource.delete", "schema": "my-schema", "description": "blah"},
        ...
      ]

where the annotation value is valid JSON, specifically an array of "EventTypes". Any tooling will be able to easily unmarshal it...

Example in go:

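// Requires the "encoding/json" and "log" imports.

// EventType mirrors one entry of the registry.knative.dev/eventTypes annotation.
type EventType struct {
    Type        string `json:"type"`
    Schema      string `json:"schema,omitempty"`
    Description string `json:"description,omitempty"`
}

// obj is assumed to be the source CRD object; Kubernetes object metadata
// exposes annotations through the exported Annotations field.
var ets []EventType
if err := json.Unmarshal([]byte(obj.Annotations["registry.knative.dev/eventTypes"]), &ets); err != nil {
    log.Printf("invalid json: %v", err)
}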

Would that work? @rhuss @matzew @n3wscott @vaikas-google

rhuss commented 4 years ago

I am ok with this approach, as it worked nicely in previous iterations of Kubernetes (anyone remember init-containers done as annotations?). However, in practice, these JSONs can become quite unwieldy, as the formatting never stays the same and everything becomes a single line once applied. This was hard to maintain, but as this is not supposed to be edited once created (at least not "live") and is not for human consumption, I would be fine with this single-key approach.

When doing it this way we should then add the meta-data required for the CLI to map flat arguments to jsonpath pointers into the CR also similarly with e.g. an annotation "cli.knative.dev/options". I would add this as a proposal to #1940 .

Also, the annotation size limit (256 KiB in total across all of an object's annotations) should be no issue here ;-)

nachocano commented 4 years ago

Thanks @rhuss, @matzew, @n3wscott and others for all the comments. After discussion, we will go with the approach mentioned above and agreed to by @rhuss. If we need something more advanced in the future, we can certainly revisit and see if we should introduce new object(s). For now, this is sufficient to cover the discoverability use case.

As soon as https://github.com/knative/eventing-contrib/pull/631 gets merged, I'll close the issue.

matzew commented 4 years ago

/close

As soon as knative/eventing-contrib#631 gets merged, I'll close the issue.

It got merged :tada:

knative-prow-robot commented 4 years ago

@matzew: Closing this issue.
