Clarify the purpose of the Broker namespace reconcilers

mattmoor commented 4 years ago

Today knative/eventing supports reconciling namespaces to create "default" Brokers within "relevant" namespaces.

The "goal state" here has been a topic of some debate in recent months and we need a clear articulation of the "goal state" that we can discuss and debate, and use to frame future judgements.

cc @grantr @matzew @vaikas @antoineco @markfisher

mattmoor commented 4 years ago

In my eyes 👀 , the "goal state" of the namespace reconciler is a DevX feature that aims to reduce the number of concepts users deal with by enabling an ambient "default" Broker in each namespace.

The two key elements of this are:

Creating a "default" Broker in each namespace, and
Default the Trigger's spec.broker to be this "default" Broker (when one isn't specified).

Historically, 1. saved us a whole lot of complexity, but it wasn't cheap enough to simply run everywhere, so we compromised on an "opt-in" model that saved us some complexity and cost, but at the expense of having to enable this in every namespace in which it was used and an unpleasant failure mode (for 2.) when we forget to do so.

However, these days creating Brokers (with defaults) is trivial, and with the MT Broker the per-Broker cost is effectively nothing. So it feels like we suffer the downsides of the compromise without the upsides.

My position on this is simply: we should re-evaluate these trade-offs before we head into a v1 saddled with the debt of older designs.

I'd appreciate it if others would share what they view as the "goal state" of the namespace reconciler work, and what they feel it buys us over everything just being explicit (no magic). 🙏

vaikas commented 4 years ago

TL;DR: I agree with the goal state roughly but will add a few more words and maybe just a few wrinkles to the mix.

From the point of view of the Operator, they would like to have an option to make the Eventing experience seamless and consistent for their users. For this to work, they will want to have an option (explicitly stating that this is not required) to control the Eventing experience for some set of namespaces. This could be all, none or some in between. For this obviously they would like to have a tool that roughly looks like today's mtnamespace labeler. There are few things (imho) missing from there*, but by having a tool will indeed remove the cognitive load for the user because they do not have to understand what / how the Broker is configured. But, as pointed out by @mattmoor, unless we do some defaulting in conjunction with this defaulter, we only solve one half of the equation. To remove preconceived notions with the existing functionality as well as roles and responsibilities, let's call this b0rk3r.

So, I guess I'd lean towards a solution where the existing trigger_defaults do not specify "default" as the broker. Why would we do this? Because b0rk3r might want to configure the the default Broker to not be called "default" for one, but more importantly unless we make this change it seems like we only solve one part of the equation. While that might not be a burning usecase, the b0rk3r should imho be something that does not only solve the creation of the default Broker, but also ensures that the user experience when creating Triggers is also consistent with how it has been configured.

If we now say that b0rk3r is responsible for creating the Broker as well as defaulting Triggers without a specified Broker then to me it seems like it's clear who does what and why. Note, that by simply making this a separately installable component, we can still provide all the existing users the same capabilities out of the box if we want to, while still having leeway for folks that do not want this behaviour to opt-out, either by configuring the b0rk3r to suit their needs, or by not installing it at all.

While I agree that the creation of Brokers with defaults is trivial, I do foresee use cases where the "Eventing experience" provided to the users by the operator is richer. For example, you might want to create the Broker, as well as create a DLQ that might log those failed events, or put them into a channel, etc. So, I do still foresee a tool that would provided added value, and hence why I think it should be a separate component. This also means that operators can plug in their implementations on top of the core.

vaikas commented 4 years ago

added to agenda for 2020-05-20

antoineco commented 4 years ago

I like the idea of delegating the broker management magic to an opinionated component outside of core. We could still have a default b0rk3r implementation which does what Matt suggests ("brokers are cheap so let's have a default instance in every namespace"), and which one would have to run separately from core. People in need for something different could still run their own b0rk3r.

mattmoor commented 4 years ago

Yeah, I just wanted to set clear expectations around the goals of running the namespace reconciler. To me it only makes sense to run it if it's stamping things out everywhere (opt-out). If we need to take action per-namespace, then sticking with the pattern folks are familiar with for creating resources feels more right to me.

I think that we can separate the namespace magic from core and have our cake and eat it too:

"core" does zero namespace magic, Triggers don't default broker names and validation fails if broker name is unspecified.
"nsmagic" can be installed separately to create brokers in labelled (or unlabelled) namespaces, and patch triggers with a default broker name when they are created in an appropriately labeled namespace.

I believe that "nsmagic" can also be used to bridge the compatibility gap so that this change is not breaking (to @grantr's concern).

akashrv commented 4 years ago

Agree with both b0rk3r and simplifying default experience. Couple of extra things to think about

Could the b0rk3r options be exposed through the operator? For end-user ( cluster operator), Operator (component) is the entry point.
The more I think about broker experience, I think ability to have multiple brokers per namespace is a good technical feature, but for most use cases having a default broker per namespace is what users need. See EventBridge as an example. I would suggest to have a b0rk3r that makes end users' life super easy for 80% scenarios - always have a Default broker available in each namespace if Eventing is installed.

akashrv commented 4 years ago

If I could extend this user experience (referring to #3097 ). Could we have some form of extended version of sinkbinding implementation (lets call is AddressableBinding for the discussion) where I can configure multiple ENV_VAR <--> Addressable pairs? Then as part of b0rk3r we can have an instance of this AddressableBinding, such that application code can always have handy Env_Var that resolve to Broker URL. For the default B0rk3r we can decide the name of the env variable, and other B0rk3rs can re-configure them.

Did that make sense? If not I can elaborate.

markfisher commented 4 years ago

I am in the "no magic" (at least for now) camp. I believe it's too early to know what the common pain points are for Brokers until a couple more implementations exist. I've been working on a RabbitMQ Broker/Trigger, and I've been discussing a few things with @grantr along the way as we're in somewhat of a similar boat delegating to external messaging systems directly from the Broker/Trigger reconcilers (GCP PubSub in his case).

We'll have a better understanding of how explicitly operators will need to configure brokers, how many distinct types of brokers they may install, how they allocate those across namespaces alongside their users' workloads, and so on - once more options exist and we see more feedback from users building distributed applications and eventually taking those to production. So I'm definitely +1 to @mattmoor's idea of keeping the magic out of core, but I would also like to see the GCP and RabbitMQ Brokers (maybe others, Kafka?) get some use before things go too far even as a separate extension for the magic (and for the record, I intend to submit a PR for moving that RabbitMQ Broker into the eventing-rabbitmq repository as soon as the initial source contribution settles there).

While developing, installing and using the RabbitMQ Broker, I've been happy with kubectl apply (see the README), as it seems simple enough, and since it's delegating to RabbitMQ as the data plane, nothing else gets created in the cluster anyways. Of course, that has been in the very limited scope of dev+testing, hence my comments about wanting to see how operators and developers interact with a growing variety of Broker implementations.

grantr commented 4 years ago

/project v1

knative-prow-robot commented 4 years ago

@grantr: A column is not provided or it's not valid for the project v1. Please provide one of the following columns in the command: [To do In progress Done]

In response to [this](https://github.com/knative/eventing/issues/3138#issuecomment-634797096): >/project v1 Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

grantr commented 4 years ago

/project v1 To do

grantr commented 4 years ago

This is a great discussion; thanks everyone for participating! Here's how I understand the sentiments presented so far:

There is uncertainty about how users and operators will want to use Eventing and how contributors will implement Brokers.
The uncertainty contributes to a lack of consensus about how broker defaulting could be made useful.
The lack of consensus leads to universal agreement that defaulting of trigger's broker name and injecting brokers into namespaces should be removed from core.

I agree with all of this. Removing defaulting from core seems like the right call for every persona. I hope it will lead to better experiences for everyone in their chosen environment and workflow.

Concretely, I think we should take these actions:

Move the namespace reconciler feature to a separate released component or guard it with an opt-in feature flag.
Move the broker creation on trigger creation feature to a separate released component (perhaps the same as above) or guard it with an opt-in feature flag.
Remove the defaulting of spec.broker for Trigger v1. The existing validation of spec.broker will ensure that it's always present. We leave the defaulting in v1beta1.

I think this addresses my concerns from https://github.com/knative/eventing/issues/2970#issuecomment-628264415:

There are stakeholders relying on the current behavior that are not included in the discussion

Possibly still true, but by making the change in v1 only we give them time to understand why the change was made and develop alternate solutions.

Removing the default is a backward incompatible change to the API.

Not if we make the change in v1 and leave it for v1beta1.

aslom commented 4 years ago

The uncertainty is more than how it operates but also what to do when it does something unexpected or do nothing (most common problem ...). I think it is related to improving Observability https://github.com/knative/eventing/issues/3387 and in general to user experience for brokers https://github.com/knative/eventing/issues/3455

knative / eventing

Clarify the purpose of the Broker namespace reconcilers #3138