shipwright-io / build

Shipwright - a framework for building container images on Kubernetes
https://shipwright.io
Apache License 2.0
640 stars 109 forks source link

Admission control webhooks for synchronous validation #596

Open imjasonh opened 3 years ago

imjasonh commented 3 years ago

Kubernetes allows controllers to register as admission control webhooks, which get called synchronously when resources are created or updated, and can reject invalid objects from being created, or modify requests on-the-fly to set defaults for example.

Shipwright doesn't currently configure an admission control webhooks, so users can currently create invalid resources (e.g., a BuildRun that references a Build that doesn't exist). Shipwright's controller will notice these and correctly update them to a failed status, but that feedback isn't immediate to users.

The controller-runtime pkg/webhook package provides methods to register a webhook.

gabemontero commented 3 years ago

https://kubernetes.slack.com/archives/C019ZRGUEJC/p1613568667090500 - a slack conversation between @ImJasonH , @otaviof , and myself at the time of this comment

some almost tl;dr level details in there :-)

qu1queee commented 3 years ago

We currently do a lot of validations in advance on the Build side, so that users can get immediate feedback on their configuration, prior to the BuildRun execution.

Because a BuildRun spec is really small, it seems that one of the webhook advantage for us would be

telling users that a referenced Build is missing

but in terms of miss-configuration, there is not too much more benefit.

Is my understanding that Tekton does not validate the Task CRD contents during the creation of that resource (contrary to what we do in Build), but only when a TaskRun is created. Does this different flow make the webhook approach more adhoc to Tekton, but not to Build in terms of configuration validation?

If the webhook can block users from creating a BuildRun because of a missing Build, I think there is a value on this. However dealing with a webhook only for doing that single verification, sounds too much for a minor benefit.

Opinions?

EDIT: My above comment is mainly around validating config via webhooks. I understand the webhook have other benefits, e.g. default values definition.

imjasonh commented 3 years ago

Is my understanding that Tekton does not validate the Task CRD contents during the creation of that resource (contrary to what we do in Build), but only when a TaskRun is created.

Tekton validates Task definitions as well:

$ cat task.yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: test
spec:
  steps:
$  kubectl apply -f task.yaml
Error from server (BadRequest): error when creating "task.yaml": admission webhook "validation.webhook.pipeline.tekton.dev" denied the request: validation failed: missing field(s): spec.steps

And I think Shipwright should also synchronously validate Builds as well.

qu1queee commented 3 years ago

@ImJasonH I see. Yes, this will definitely be nice to have for both Build/BuildRun. I know Tekton have a controller to generate webhook certificates, is this something that you think we also need for the admission one?

imjasonh commented 3 years ago

Tekton uses knative/pkg's webhook package to create and manage webhook certs, which has been helpful (code here).

I honestly don't know much about how webhooks (certs especially) are configured for other frameworks like controller-runtime.

gabemontero commented 3 years ago

Tekton uses knative/pkg's webhook package to create and manage webhook certs, which has been helpful (code here).

I honestly don't know much about how webhooks (certs especially) are configured for other frameworks like controller-runtime.

I'll dig up a non-knative reference and post it here.

Technically speaking I think we could use the knative plumbing to build our cmd/webhook image/depoyment/etc. I used that to positive effect back when I was trying to get some RBAC related scenarios addressed in tekton directly vs. the keep it out of tekton OPA approach the community currently is on. I don't remember it having specific dependencies on the framework used for the controller/reconciler, but perhaps you are aware of something I'm forgetting @ImJasonH

But even if it is possible, it would be good to compare the pros and cons of both approaches and get to a consensus on what is desired choice.

imjasonh commented 3 years ago

Technically speaking I think we could use the knative plumbing to build our cmd/webhook image/depoyment/etc

Yeah, I also think we could. I do think eventually we'd prefer to only rely on one framework, and not mix controller-runtime and knative/pkg. So given that, I think we should start by trying to run the webhook job without relying on knative/pkg. Gabe, any more guides/examples you could provide would be helpful there. 👍

zhangtbj commented 3 years ago

I remember we had an initial discussion at the beginning, also cc @sbose78 , at that time, we try to avoid introducing the webhook to make shipwright build simple.

Maybe we can discuss that again.

I think for shipwright build, I don't know if we will introduce more parameters in future.

But for the existing parameters, I am not sure if the default schema validation is ok for us, like this one: https://github.com/shipwright-io/build/issues/474

gabemontero commented 3 years ago

Technically speaking I think we could use the knative plumbing to build our cmd/webhook image/depoyment/etc

Yeah, I also think we could. I do think eventually we'd prefer to only rely on one framework, and not mix controller-runtime and knative/pkg. So given that, I think we should start by trying to run the webhook job without relying on knative/pkg. Gabe, any more guides/examples you could provide would be helpful there. +1

@ImJasonH after checking out a few options, I'm a bit torn :-). I still like the basic building blocs provided by k8s, as noted in this blog post: https://kubernetes.io/blog/2019/03/21/a-guide-to-kubernetes-admission-controllers

Where the MutatingWebhookConfiguration is the glue that correlates your code to a service the API server will call.

And your impl has the start/validate/mutate entrypoints and the flexibility/power integrating at a "lower level".

That said, sure things like controller-runtime and knative supply some additional abstractions and functions (cert regen IIRC).

I think controller runtime's are at https://github.com/kubernetes-sigs/controller-runtime/tree/master/pkg/webhook

imjasonh commented 3 years ago

Someone shared this during the community call yesterday and I forgot who it was now (sorry!) but it seems useful: https://book.kubebuilder.io/cronjob-tutorial/webhook-implementation.html

gabemontero commented 3 years ago

Apologies to the non-Red Hatters, but we are having an impromptu deep dive on knative vs. operator-fw/OLM at https://coreos.slack.com/archives/C3VS0LV41/p1616609181255300

Between @ImJasonH and myself we need to provide a end synopsis after it settles

In the interim, some of those links @ImJasonH asked for back in https://github.com/shipwright-io/build/issues/596#issuecomment-782205975

imjasonh commented 3 years ago

Technically speaking I think we could use the knative plumbing to build our cmd/webhook image/depoyment/etc

Yeah, I also think we could. I do think eventually we'd prefer to only rely on one framework, and not mix controller-runtime and knative/pkg. So given that, I think we should start by trying to run the webhook job without relying on knative/pkg. Gabe, any more guides/examples you could provide would be helpful there. +1

@ImJasonH after checking out a few options, I'm a bit torn :-). I still like the basic building blocs provided by k8s, as noted in this blog post: https://kubernetes.io/blog/2019/03/21/a-guide-to-kubernetes-admission-controllers

Where the MutatingWebhookConfiguration is the glue that correlates your code to a service the API server will call.

And your impl has the start/validate/mutate entrypoints and the flexibility/power integrating at a "lower level".

That said, sure things like controller-runtime and knative supply some additional abstractions and functions (cert regen IIRC).

I think controller runtime's are at https://github.com/kubernetes-sigs/controller-runtime/tree/master/pkg/webhook

Only relying on k8s primitives certainly sounds nice, but then you end up building a lot of infra code that ends up taking up a lot of eng cycles.

controller-runtime seems to not include cert (re)generation by default, and expects clients to generate and manage those themselves. knative/pkg's webhook package (here) does that automatically, which is really nice.

The "downside" (if you can call it that) is that adopting knative/pkg for webhooks probably means we should adopt it for the controller framework too, which honestly wouldn't be that bad IMO. I'll try to scope out how much work that is too to get an idea how big an undertaking that would be, and if we want to do it.

qu1queee commented 3 years ago

I think https://coreos.slack.com/archives/C3VS0LV41/p1616609181255300 does not work for me :)

Looking forward for more information on what we think is the best approach for us.