kubernetes-sigs / cli-utils

This repo contains binaries that built from libraries in cli-runtime.
Apache License 2.0
156 stars 78 forks source link

feat: Add ValidationPolicy & ValidationEvent #488

Closed karlkfi closed 2 years ago

karlkfi commented 2 years ago

fixes: https://github.com/kubernetes-sigs/cli-utils/issues/484

karlkfi commented 2 years ago

@seans3 The current existing behavior already exits early if the Validator errors. I don't understand why you think that's misguided. The code is littered with hundreds of places that assume that minimal validation has been done. It's not going to be a small change to fix that to skip invalid objects (which seems to be what you want).

What this PR primarily does is adds the ability to skip those invalid objects.

I also added more explicit depends-on & apply-time-mutation errors, but that's because these are client-side annotations and the server won't validate them. So if they're invalid and the applier continues without skipping them (the current behavior), then that's behaviorally incorrect. We should never completely ignore invalid depends-on & apply-time-mutation errors. We should either skip those invalid objects or exit early. Either way, the user needs to be notified. That's why I added the ValidationError, to make sure the user always gets notified, regardless of whether they chose SkipInvalid or ExitEarly.

Now that I've said that tho, I think the ApplyAll policy is probably only useful when not using depends-on or apply-time-mutation. Otherwise the behavior when they're invalid would be obviously incorrect. ApplyAll would basically be opting out of client-side validation and ignoring errors in client-side features. That's basically what you're championing when you're saying the applier should be more like kubectl and do less validation. The big difference is that kubectl doesn't have any client-side-only features like depends-on or apply-time-mutation.

karlkfi commented 2 years ago

I've made significant changes to the handling of dependency validation. Please re-review.

I still have more tests to add.

karlkfi commented 2 years ago

Extracted replacement of -OrDie (panic) functions to https://github.com/kubernetes-sigs/cli-utils/pull/505

This should vastly reduce the number of files changed in this PR, once rebased.

karlkfi commented 2 years ago

This has been rebased on top of https://github.com/kubernetes-sigs/cli-utils/pull/505 and https://github.com/kubernetes-sigs/cli-utils/pull/506

karlkfi commented 2 years ago

Extracted the MultiError changes to https://github.com/kubernetes-sigs/cli-utils/pull/508

karlkfi commented 2 years ago

Ok, I did a big refactor to move the sorting bakc into the Solver. And got the tests finally passing again.

TODO:

karlkfi commented 2 years ago

/hold

karlkfi commented 2 years ago

/unhold

All TODOs finished. Ready for another review pass.

mortent commented 2 years ago

I'll remove the approval just to make sure @seans3 also has a chance to look at it before it merges. /approve cancel

karlkfi commented 2 years ago

I do not believe exiting early should be the default behavior. Even if this behavior exists now, it is incorrect and should be fixed to continue on error. Exiting early is also not consistent with the kubectl apply behavior which continues on error.

It's likely that ConfigSync and kpt will both initially use ExitEarly when adopting this new code. They may later switch to either hard coding SkipInvalid OR piping the option through to the user. Does that affect your default suggestion @seans3 ?

karlkfi commented 2 years ago

FWIW, I do not consider ExitEarly to a be a bug. Its a feature which arises from the desire to group related changes. Of course, Kubernetes doesn't have atomic applies or groups, but you can at least avoid applying a Deployment if its Secret, ServiceAccount, or ConfigMap is invalid. This behavior is highly desired when you take the time to granularly group your objects.

On the other hand, SkipInvalid is attractive when you haven't or don't intend to group your objects. This comes up regularly when multiple teams share a single config repo or delegate config management to a central SRE/DevOps/Ops team.

I'm ok with either being default. I don't think it really matters. There are pros and cons on both sides.

I'm biasing slightly towards reverse compatibility here, but if anyone else wants to chime in with a 3rd vote I don't mind making the change.

mortent commented 2 years ago

I would probably prefer that we keep the current behavior here (so ExitEarly is the default), but I don't feel strongly about it.

haiyanmeng commented 2 years ago

For Config Sync users, the default should be ExitEarly, which is closer to the existing validation behavior in Config Sync.

Supporting the SkipInvalid validation policy would require lots of work from Config Sync. Today, Config Sync validates the configs declared in a Git repo, and only sends the configs to cli-utils to apply after the configs pass all the validations (there are about several dozens of validations in Config Sync).

k8s-ci-robot commented 2 years ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: karlkfi, seans3

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/kubernetes-sigs/cli-utils/blob/master/OWNERS)~~ [seans3] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment