kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

Kubectl diff fails when there are immutable fields #118325

Open TheEdgeOfRage opened 1 year ago

TheEdgeOfRage commented 1 year ago

What happened?

I am trying to run `kubectl diff --server-side --force-conflicts` against our cluster, but it fails with the following message:

The Job "test-job" is invalid: spec.template: Invalid value: core.PodTemplateSpec{...}: field is immutable

We use Flux with the `force: true` option when applying this change to the cluster, so immutable fields aren't a problem when deploying, but they are a problem when trying to do a diff against the cluster.
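For reference, this is roughly what a Flux Kustomization with `force: true` looks like (a minimal sketch; the name, path, and interval are illustrative, not taken from this issue):

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: jobs              # illustrative
  namespace: flux-system
spec:
  interval: 10m
  path: ./jobs            # illustrative
  prune: true
  force: true             # recreate resources when an immutable field changes
  sourceRef:
    kind: GitRepository
    name: flux-system
```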

What did you expect to happen?

I would expect kubectl diff to have a --force flag, just like the apply command, to ignore immutable fields during diff operations. I see that there is --force-conflicts, but it doesn't fix this issue.

How can we reproduce it (as minimally and precisely as possible)?

Create a simple Job and apply it to the cluster:

```yaml
---
apiVersion: batch/v1
kind: Job
metadata:
  name: test-job
  namespace: default
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: create-dbs
          image: alpine
          command:
            - echo
            - hello there
```

Update any field in the template spec, for example change the hello there message.
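For example, editing only the message in job.yaml (the replacement text is arbitrary):

```yaml
          command:
            - echo
            - hello again   # changed from "hello there"
```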

```console
$ kubectl diff --server-side --force-conflicts -f job.yaml
The Job "test-job" is invalid: spec.template: Invalid value: core.PodTemplateSpec{...}: field is immutable
```

The `...` is the redacted pod spec; it's not relevant here.

Anything else we need to know?

No response

Kubernetes version

```console
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.2", GitCommit:"7f6f68fdabc4df88cfea2dcf9a19b2b830f1e647", GitTreeState:"archive", BuildDate:"2023-05-28T13:03:55Z", GoVersion:"go1.20.4", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.2", GitCommit:"7f6f68fdabc4df88cfea2dcf9a19b2b830f1e647", GitTreeState:"clean", BuildDate:"2023-05-25T23:29:16Z", GoVersion:"go1.20.4", Compiler:"gc", Platform:"linux/amd64"}
```

Cloud provider

local kind cluster

OS version

```console
# On Linux:
$ cat /etc/os-release
NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
ANSI_COLOR="38;2;23;147;209"
HOME_URL="https://archlinux.org/"
DOCUMENTATION_URL="https://wiki.archlinux.org/"
SUPPORT_URL="https://bbs.archlinux.org/"
BUG_REPORT_URL="https://bugs.archlinux.org/"
PRIVACY_POLICY_URL="https://terms.archlinux.org/docs/privacy-policy/"
LOGO=archlinux-logo

$ uname -a
Linux niccolum 6.3.4-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 24 May 2023 17:44:00 +0000 x86_64 GNU/Linux
```

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

TheEdgeOfRage commented 1 year ago

/sig API Machinery

k8s-ci-robot commented 1 year ago

@TheEdgeOfRage: The label(s) sig/api, sig/machinery cannot be applied, because the repository doesn't have them.

In response to [this](https://github.com/kubernetes/kubernetes/issues/118325#issuecomment-1568289115):

> /sig API Machinery

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

TheEdgeOfRage commented 1 year ago

/sig api-machinery

tamilselvan1102 commented 1 year ago

The existing Job needs to be deleted, because the template section of a Job is immutable and cannot be updated. So you have the following two options:

  1. Always create a new Job with a unique name, so the old Jobs are left alone and a new one is created every time; including the image version in the name would be a sensible way to do this.
  2. Clean up Jobs automatically: the Job field ttlSecondsAfterFinished deletes the Job automatically after the specified period (see the sketch after this list).
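A minimal sketch of option 2, applied to the Job from the reproduction above (the TTL value is arbitrary):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: test-job
  namespace: default
spec:
  ttlSecondsAfterFinished: 300   # delete the Job about 5 minutes after it finishes
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: create-dbs
          image: alpine
          command:
            - echo
            - hello there
```
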
TheEdgeOfRage commented 1 year ago

But that would mean that the job shows up in the diff every time. I'm fine with force-applying the job on deploy; that's what we're doing anyway. The annoying part is that the diff fails, and we use kubectl diff to print the diff in PRs when we open them.

Generating a new job every time, or deleting the old one, would mean a large diff of all the same jobs every time, and you couldn't tell what actually changed.
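Roughly what that PR check looks like (a sketch; the path and the success handling are illustrative, not from this issue):

```console
# kubectl diff exits 0 for no changes, 1 for changes, and >1 on errors,
# so the pipeline treats 0 and 1 as success and posts the output on the PR
$ kubectl diff --server-side --force-conflicts -k ./clusters/production > pr-diff.txt || [ $? -eq 1 ]
# with an immutable-field change, the command exits >1 instead and the whole check fails
```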

cici37 commented 1 year ago

/cc @apelisse Would you mind taking a look at this one when you have time? Thank you :)

/triage accepted

apelisse commented 1 year ago

I'm not sure I properly understand the issue. You typically apply with --force so that the job gets deleted and recreated, and you can't find the flag in kubectl diff? So you would want --force in kubectl diff, but you would like to not show the deletion/recreation, is that right?
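To restate the asymmetry with the reproduction above (a sketch; `kubectl apply --force` is the client-side delete-and-recreate fallback being referred to):

```console
# apply can fall back to delete-and-recreate when an immutable field changed
$ kubectl apply --force -f job.yaml
# diff has no equivalent, so the same change just errors out
$ kubectl diff --server-side --force-conflicts -f job.yaml
The Job "test-job" is invalid: spec.template: Invalid value: core.PodTemplateSpec{...}: field is immutable
```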

TheEdgeOfRage commented 1 year ago

That is correct. It doesn't have to be --force for the diff command; it could be something like --ignore-immutable, to make it clear that it only changes how the diff is computed, not the action that would be performed.

If that doesn't make sense semantically, I'd also be happy with a diff that shows the jobs being deleted and recreated, as long as the command succeeds. Currently it's completely broken whenever there's a change to a Job.

apelisse commented 1 year ago

Yeah, we definitely don't recommend using --force, I think, but I see where you're coming from. In general we want diff to be as close as possible to apply, so I'd be OK with adding a force flag there, but we would have to show the full deletion/recreation. We would also have to play with the name of the object a little to guarantee that diff(1) triggers a full diff.
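For background on that last point (a rough sketch of the mechanism, not spelled out in this thread): kubectl diff dumps the live and the merged objects into two temporary directories and runs an external diff over them, so a delete/recreate would need distinct file names to show up as a full removal plus a full addition:

```console
# roughly what kubectl diff executes today (directory and file names are illustrative)
$ diff -u -N /tmp/LIVE-123456/batch.v1.Job.default.test-job /tmp/MERGED-123456/batch.v1.Job.default.test-job
```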

TheEdgeOfRage commented 1 year ago

We use Flux to apply our manifests to the clusters, and they recommend the force apply option when updating resources with immutable fields. If we were to expire Jobs and delete them, Flux would keep recreating them on every reconciliation loop, so that's not really feasible in this case. Generating a new name would be possible, but it would be tedious: we would have to add that functionality to our CI pipeline, and it would add more commits to our git log. I don't see a good option other than using --force when applying.

apelisse commented 1 year ago

Sure, I think it's fine to add --force to kubectl diff, but I don't have any capacity to help with that at the moment unfortunately.

aparekh7 commented 1 year ago

@apelisse I can pick this up, but I would require some help since I am new here. Would that be feasible?

apelisse commented 1 year ago

Absolutely, feel free to ping me with questions on Slack (same handle as my GitHub) or here if needed! Good luck.

aparekh7 commented 1 year ago

/assign

lucasfcnunes commented 1 year ago

@aparekh7 any progress?

k8s-triage-robot commented 3 weeks ago

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  - Confirm that this issue is still relevant with /triage accepted (org members only)
  - Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

cici37 commented 2 weeks ago

/triage accepted

/cc @seans3