Open snakebyte91 opened 1 year ago
Tigera operator modifies the Installation resource and argo cd will immediately revert the changes
Do you have visibility into what types of modifications are being made?
The operator will default some fields if they are unset and write them back, which is what I suspect is happening.
What version of the operator are you running?
I think I found the problem. I did not add the "/" at the end of the registry. So the tigera-operator changed the value for registry to <account_id>.dkr.ecr.eu-central-1.amazonaws.com/
and Argo changed it back to <account_id>.dkr.ecr.eu-central-1.amazonaws.com
. This happened in an endless loop.
The following values file is working with Argo CD now:
installation:
enabled: true
kubernetesProvider: EKS
registry: <account_id>.dkr.ecr.eu-central-1.amazonaws.com/
Desired Manifest in Argo CD:
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
labels:
argocd.argoproj.io/instance: dev-helm-tigera-operator
name: default
spec:
imagePullSecrets: []
kubernetesProvider: EKS
registry: <account_id>.dkr.ecr.eu-central-1.amazonaws.com/
Live Manifest in Argo CD:
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: >
{"apiVersion":"operator.tigera.io/v1","kind":"Installation","metadata":{"annotations":{},"labels":{"argocd.argoproj.io/instance":"dev-helm-tigera-operator"},"name":"default"},"spec":{"imagePullSecrets":[],"kubernetesProvider":"EKS","registry":"838509663246.dkr.ecr.eu-central-1.amazonaws.com/"}}
creationTimestamp: '2022-12-22T09:11:58Z'
finalizers:
- tigera.io/operator-cleanup
generation: 4043
labels:
argocd.argoproj.io/instance: dev-helm-tigera-operator
name: default
resourceVersion: '81619299'
uid: d90cad43-8a65-48ce-ab13-ba76b8cc4755
spec:
calicoNetwork:
bgp: Disabled
linuxDataplane: Iptables
cni:
ipam:
type: AmazonVPC
type: AmazonVPC
controlPlaneReplicas: 2
flexVolumePath: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/
imagePullSecrets: []
kubeletVolumePluginPath: /var/lib/kubelet
kubernetesProvider: EKS
nodeUpdateStrategy:
rollingUpdate:
maxUnavailable: 1
type: RollingUpdate
nonPrivileged: Disabled
registry: 838509663246.dkr.ecr.eu-central-1.amazonaws.com/
variant: Calico
status:
computed:
calicoNetwork:
bgp: Disabled
linuxDataplane: Iptables
cni:
ipam:
type: AmazonVPC
type: AmazonVPC
controlPlaneReplicas: 2
flexVolumePath: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/
kubeletVolumePluginPath: /var/lib/kubelet
kubernetesProvider: EKS
nodeUpdateStrategy:
rollingUpdate:
maxUnavailable: 1
type: RollingUpdate
nonPrivileged: Disabled
registry: <account_id>.dkr.ecr.eu-central-1.amazonaws.com/
variant: Calico
conditions:
- lastTransitionTime: '2023-01-12T16:18:51Z'
message: All Objects Available
observedGeneration: 4043
reason: AllObjectsAvailable
status: 'False'
type: Degraded
- lastTransitionTime: '2023-01-12T16:18:51Z'
message: All objects available
observedGeneration: 4043
reason: AllObjectsAvailable
status: 'True'
type: Ready
- lastTransitionTime: '2023-01-12T16:18:51Z'
message: All Objects Available
observedGeneration: 4043
reason: AllObjectsAvailable
status: 'False'
type: Progressing
mtu: 9001
variant: Calico
I think the tigera-operator adds attributes like spec.calicoNetwork and spec.cni
Helm chart version 3.24.5 is using the image tigera/operator:v1.28.5
So the tigera-operator changed the value for registry to
.dkr.ecr.eu-central-1.amazonaws.com/ and Argo changed it back to .dkr.ecr.eu-central-1.amazonaws.com. This happened in an endless loop.
This isn't the first time I've seen this - we should probably adjust the operator logic so that we don't write the /
back to that field.
We still need to add it internally if needed, but the problem is that we try to "correct" the input config, causing this loop.
Hi ! I have got the same issue here. I didn't mention the final "/" in my values file for a calico deployment using helm. The exact issue is Calico-apiserver pods (only them) were endlessly terminating and re-starting. Half of the time, they were using the right image path and the other half of the time were trying to use an image from my registry with a path without the final "/". The necessity of adding the final "/" in the values.yaml file is not at all mention in the documentation (eg: https://github.com/projectcalico/calico/blob/master/charts/tigera-operator/values.yaml#L45). At least while this issue stays open and unfixed it could be nice to mention it somewhere in the documentation to avoid having to debug it.
@Mohsen51 The "/" requirement is mentioned in the Calico docs, albeit somewhat buried in the API documentation.
I agree though, this rather unique requirement should be added as a comment to the values.yaml file.
Expected Behavior
Helm chart resources are not modified after deployment.
Current Behavior
Tigera operator modifies the Installation resource and argo cd will immediately revert the changes. This results in a not healthy apiservice.
Steps to Reproduce (for bugs)
installation.registry
Context
Use calico images from a private registry
Your Environment
Helm chart version: 3.24.5 Kubernetes version: 1.22