projectcalico / calico

Cloud native networking and network security
https://docs.tigera.io/calico/latest/about/
Apache License 2.0
6.04k stars 1.35k forks source link

Unable to patch network policies after Calico 3.29.0 upgrade #9437

Open kashook opened 3 weeks ago

kashook commented 3 weeks ago

After upgrading to Calico 3.29.0, we noticed that all caliconetworkpolices and globalnetworkpolices were renamed to be prefixed with default. and found the documentation related to the new tier functionality that explains why.

Despite the policy resources being renamed, it does appear that it is generally possible to perform kubernetes API operations against them using the original, non-prefixed names. For example:

 $ kubectl get gnp testpol

NAME              CREATED AT
default.testpol   2024-11-04T20:44:22Z
 $ kubectl delete gnp testpol
globalnetworkpolicy.projectcalico.org "testpol" deleted

However, if you attempt to perform a patch using its original, non-prefixed name, you get an error. For example:

 $ kubectl patch gnp testpol -p '{"metadata":{"labels":{"testlabel":"value"}}}'
Error from server (BadRequest): the name of the object (default.testpol) does not match the name on the URL (testpol)
 $ kubectl label gnp testpol testlabel=value
Error from server (BadRequest): the name of the object (default.testpol) does not match the name on the URL (testpol)

This also affects any helm chart that contains networkpolicy/globalnetworkpolicy resources. An example from one of the helm releases we deploy in our cluster:

Error: UPGRADE FAILED: cannot patch "egress-athens-proxy-http" with kind GlobalNetworkPolicy: the name of the object (default.egress-athens-proxy-http) does not match the name on the URL (egress-athens-proxy-http) && cannot patch "athens-proxy" with kind NetworkPolicy: the name of the object (default.athens-proxy) does not match the name on the URL (athens-proxy)

Because of this, all our helm charts, or anything else that attempts to patch polices are broken until they are updated to include the default. prefix on the name.

Is this intended behavior? I didn't see a breaking change mentioned in the release notes. The fact that you can perform other operations (such as get and delete) without the prefix makes me hope that this is maybe just a bug and that we won't have to modify all of our helm charts to add the prefix on all the network policies.

caseydavenport commented 3 weeks ago

@kashook thanks for raising this - will take a look. I don't think this is intended behavior.

ffilippopoulos commented 2 weeks ago

+1 We've seen the same behaviour and rolled back. I also asked on slack: https://calicousers.slack.com/archives/C0BCA117T/p1730716035332589 and seems that this is a behaviour "promoted" from enterprise versions to open source. The issue for us is that our CD is constantly fighting this back and wants to create resources as named in our committed manifests. Based on the change that introduced this: https://github.com/projectcalico/calico/pull/9166/files it seems like there is no way around it atm, as canonicalizePolicyName func will rename resources before admitting.

caseydavenport commented 2 weeks ago

https://github.com/projectcalico/calico/pull/9166/files it seems like there is no way around it atm, as canonicalizePolicyName func will rename resources before admitting.

canonicalizePolicyName is meant to fix this particular issue. Specifically, without canonicaliePolicyName we'd be seeing this issue for Update requests in addition to Patch requests. That PR even introduced FV tests to verify this.

I think the problem here is that PATCH operations are handled differently from updates, and are running into this checkName function before the Calico storage layer can even call canonicalizePolicyName.