carvel-dev / kapp

kapp is a simple deployment tool focused on the concept of "Kubernetes application" — a set of resources with the same label
https://carvel.dev/kapp
Apache License 2.0
934 stars 110 forks

Noop changes should observe change rules for upserting, or be performed last #1000

Closed OlofKalufs closed 2 months ago

OlofKalufs commented 3 months ago

I have a package that, among other things, includes a kapp-controller PackageInstall, which looks like this (superfluous information omitted):

apiVersion: packaging.carvel.dev/v1alpha1
kind: PackageInstall
metadata:
  annotations:
    kapp.k14s.io/change-group: ccp.ipbps.cgi.com/contour
    kapp.k14s.io/change-rule.0: upsert after upserting ccp.ipbps.cgi.com/packages
    kapp.k14s.io/change-rule.1: delete before deleting ccp.ipbps.cgi.com/packages
  name: contour
spec:
  packageRef:
    refName: contour.tanzu.vmware.com
    versionSelection:
      constraints: 1.26.1+vmware.1-tkg.1
...

Together with a kapp-controller Package that looks like this (again, superfluous information omitted):

apiVersion: data.packaging.carvel.dev/v1alpha1
kind: Package
metadata:
  annotations:
    kapp.k14s.io/change-group: ccp.ipbps.cgi.com/packages
  name: contour.tanzu.vmware.com.1.26.1+vmware.1-tkg.1
spec:
...

Then I update my package to a new version. The update changes the version constraint in the PackageInstall, removes the Package with the old version, and simultaneously adds a new Package with the new version. There are some other changes in the package as well, among them some updated Secrets.

What happened: The change to the PackageInstall happens immediately, and then it fails due to the fact that the new Package isn't available. The new Package won't be installed, since it awaits the Secrets to be deployed first. Then it gets stuck in a loop where the PackageInstall has a noop operation that fails due to the Package not existing, and the Package won't be created since it is in the next step in the dependency graph.

What did you expect: Primarily, I would expect the update to the PackageInstall to happen after the Package has been created, since the change rule indicates this. Once the problem has occurred and the PackageInstall is failing, I would expect the noop operation to observe the change rules, so that its failure doesn't stop reconciliation unless it still fails after the Packages it depends on are ready.

Another workaround would be to handle all noops after the other changes in the change set, something I've implemented in a fork and which solved the problem for me. I'll add the PR for that once this issue is created.

Anything else you would like to add: No

Environment:


Vote on this request

This is an invitation to the community to vote on issues, to help us prioritize our backlog. Use the "smiley face" up to the right of this comment to vote.

👍 "I would like to see this addressed as soon as possible" 👎 "There are other more important things to focus on right now"

We are also happy to receive and review Pull Requests if you want to help working on this issue.

renuy commented 3 months ago

@100mik Please review the PR

praveenrewar commented 3 months ago

The change to the PackageInstall happens immediately, and then it fails due to the fact that the new Package isn't available. The new Package won't be installed, since it awaits the Secrets to be deployed first

@OlofKalufs Could you clarify this, please? I am not sure why the PackageInstall change would get applied first, as the change rule would come into effect. Could you share the kapp deploy logs from when you first make this change in the pkg and pkgi?

OlofKalufs commented 2 months ago

I'm sorry, I don't have the set-up available at the moment. But I examined it a bit more, and the reason this happened was that a few Secrets were due to be updated but had the update strategy skip (kapp.k14s.io/update-strategy: skip). This meant that the first batch of updates contained those Secrets (for which nothing was actually done) and then the PackageInstall. The reason the PackageInstall gets applied first is that the change rule doesn't apply when the change is a noop.
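
To illustrate, a Secret of the kind described here might look like the following. This is only a sketch: the name and the data are hypothetical, and only the update-strategy annotation is taken from the description above.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: example-credentials   # hypothetical name
  annotations:
    # With this strategy kapp does not apply updates to the resource;
    # a changed Secret still appears in the change set, as a noop.
    kapp.k14s.io/update-strategy: skip
type: Opaque
stringData:
  username: example           # hypothetical value shipped by the package
```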

The reason the Secrets had update-strategy: skip was that controllers inject items into them. I changed this so that the injected items are preserved with a rebaseRule in a Config object instead. That solved the problem for me, because the Secrets then no longer showed up as diffs and the Package ended up in the first batch.
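
A rebase rule of this kind could look roughly like the sketch below. The matcher and the path are assumptions for illustration: ca.crt stands in for whatever key the controller injects, and the rule as written matches every v1 Secret, which is probably broader than you would want in practice.

```yaml
apiVersion: kapp.k14s.io/v1alpha1
kind: Config
rebaseRules:
# Carry the controller-injected key over from the resource already on the
# cluster before diffing, so the injected value alone no longer produces a diff.
- path: [data, ca.crt]         # hypothetical injected key
  type: copy
  sources: [existing, new]     # prefer the cluster value, fall back to the new one
  resourceMatchers:
  - apiVersionKindMatcher: {apiVersion: v1, kind: Secret}
```

The Config object is deployed together with the rest of the resources, and the update-strategy annotation is removed from the Secrets it covers.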

Sorry that I can't be of more help, but at least the problem is solved for me now with this workaround.

praveenrewar commented 2 months ago

No worries @OlofKalufs! It would have been helpful to be able to see the config to understand why the pkgi was applied first. We do have an existing issue in the backlog where the noop operation messes up the waiting order. There are also flags available to apply/wait for as many changes as possible: --exit-early-on-apply-error and --exit-early-on-wait-error.
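
If the deployment happens to be driven by a kapp-controller App, these flags can be passed through rawOptions. A minimal fragment, assuming that setup (the rest of the App spec is omitted here); both flags default to true, so setting them to false lets kapp carry on past individual failures:

```yaml
# Fragment of an App spec; the fetch and template sections are omitted.
spec:
  deploy:
  - kapp:
      rawOptions:
      - --exit-early-on-apply-error=false
      - --exit-early-on-wait-error=false
```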

I am closing this issue for now as it might have been a duplicate of the issue I mentioned above and we don't have any way of reproducing it.