argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

Temporary per-resource sync disable #7975

Open unb9rn opened 2 years ago

unb9rn commented 2 years ago

Summary

Having an annotation like Flux's (fluxcd.io/ignore: "true") would be great. It would allow users to tamper with dev resources in real time, as well as integrate Argo CD with tools like Okteto.

Motivation

I am trying to set up in-cluster app development with Okteto. It substitutes the running deployment with a development one while syncing local code to the remote pod. I don't want to disable the self-heal feature altogether, but it would be great to make Argo CD "forget" about any changes while the developer works inside the dev container.

Proposal

Implementing some form of annotation should solve this problem.
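
For reference, this is roughly what the Flux annotation mentioned in the summary looks like when set on a resource; the names below are illustrative placeholders, and the ask is for an Argo CD equivalent:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-dev-app              # illustrative placeholder name
  annotations:
    fluxcd.io/ignore: "true"    # Flux skips this resource while the annotation is present
spec:
  selector:
    matchLabels:
      app: my-dev-app
  template:
    metadata:
      labels:
        app: my-dev-app
    spec:
      containers:
      - name: app
        image: my-dev-image     # illustrative placeholder image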

rosscdh commented 2 years ago

+1. For example:

  1. an app creates a namespace
  2. we don't want that namespace to be deleted if the app gets deleted (as other apps may be installed into that namespace)

evoosa commented 2 years ago

bumping

crenshaw-dev commented 2 years ago

Is this equivalent to just disabling auto-sync, or would you also like to disable manual syncs?

dont want that ns to be deleted if the app gets deleted (as other apps may be installed into that ns)

@rosscdh this seems like a different request, basically to prevent pruning.
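
For the namespace case specifically, a minimal sketch using the existing sync-option annotation (the namespace name is an illustrative placeholder); it tells Argo CD not to prune the resource during syncs:

apiVersion: v1
kind: Namespace
metadata:
  name: shared-namespace    # illustrative placeholder name
  annotations:
    # existing Argo CD sync option: never delete this resource as part of sync pruning
    argocd.argoproj.io/sync-options: Prune=false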

evoosa commented 2 years ago

Yes, I am looking to disable auto-sync, but only for specific resources, not for a whole app. We have an app of apps, so I can't control each app individually anyway. I found a way to ignore a type of resource completely across a whole cluster, by adding this to the argocd-cm ConfigMap:

data:
  resource.exclusions: |
    - apiGroups:
      - "apps"
      kinds:
      - Deployment
      clusters:
      - "<CLUSTER_URL>"

BUT what I need is a way to ignore specific resources independent of their type, and only on demand.

crenshaw-dev commented 2 years ago

Thanks for the explanation!

for some reason it's inconsistent

Are there any downsides to argocd.argoproj.io/compare-options: IgnoreExtraneous besides the fact that it seems to be buggy? If not, would you like to detail that issue, and maybe we can address it as a bug rather than an enhancement?
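
For context, the annotation under discussion is set on the live resource itself; a minimal sketch, with an illustrative ConfigMap name:

apiVersion: v1
kind: ConfigMap
metadata:
  name: generated-config    # illustrative placeholder name
  annotations:
    # existing Argo CD compare option: a resource that exists in the cluster
    # but not in the Application's source no longer makes the app OutOfSync
    argocd.argoproj.io/compare-options: IgnoreExtraneous
data:
  example-key: example-value    # illustrative data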

evoosa commented 2 years ago

Hey, I'd love to add some details regarding this issue; it persists. It happened as follows:

  1. One of our developers used Okteto to create a deployment named control. He added the annotation to his Okteto YAML.
  2. Suddenly, he got the following error from Okteto:
    Follow these steps:
          1. Execute 'okteto down'
          2. Apply your manifest changes again: 'kubectl apply'
          3. Execute 'okteto up' again
    More information is available here: https://okteto.com/docs/reference/known-issues/#kubectl-apply-changes-are-undone-by-okteto-up

it's important to mention that the deployment yaml didn't change in git in any branch, and wasn't synced manually at the time.

  3. I looked at the events of the control app in the Argo CD UI and found the following logs:
    REASON              MESSAGE                                            COUNT  FIRST OCCURRED                  LAST OCCURRED
    ScalingReplicaSet   Scaled down replica set control-598448fcc7 to 0   6      1d ago (Yesterday at 11:48 AM)  20m ago (Today at 11:40 AM)
    ScalingReplicaSet   Scaled up replica set control-9c565658c to 1      7      1d ago (Yesterday at 11:48 AM)  20m ago (Today at 11:40 AM)
    ScalingReplicaSet   Scaled up replica set control-598448fcc7 to 1     6      1d ago (Yesterday at 11:48 AM)  33m ago (Today at 11:26 AM)
    ScalingReplicaSet   Scaled down replica set control-bcf7d485c to 0    9      24d ago (06/23/2022)            33m ago (Today at 11:26 AM)
    ScalingReplicaSet   Scaled down replica set control-9c565658c to 0    6      1d ago (Yesterday at 11:48 AM)  34m ago (Today at 11:26 AM)
    ScalingReplicaSet   Scaled up replica set control-bcf7d485c to 1      9      25d ago (06/22/2022)            34m ago (Today at 11:26 AM)

In other words, the ReplicaSet was scaled up and down for an unknown reason :\

  4. I looked at the cluster's events in the developer's namespace and found the following lines:
    28m         Normal    Scheduled                pod/control-9c565658c-tlrkp            Successfully assigned camel/control-9c565658c-tlrkp to ip-192-168-24-93.ec2.internal
    28m         Normal    SuccessfulCreate         replicaset/control-9c565658c           Created pod: control-9c565658c-tlrkp
    28m         Normal    Pulled                   pod/control-9c565658c-tlrkp            Container image "okteto/bin:1.3.3" already present on machine
    28m         Normal    Created                  pod/control-9c565658c-tlrkp            Created container okteto-bin
    28m         Normal    Started                  pod/control-9c565658c-tlrkp            Started container okteto-bin
    28m         Normal    Created                  pod/control-9c565658c-tlrkp            Created container okteto-init-volume
    28m         Normal    Started                  pod/control-9c565658c-tlrkp            Started container okteto-init-volume
    28m         Normal    Pulled                   pod/control-9c565658c-tlrkp            Container image "okteto/node:14" already present on machine
    28m         Normal    Pulling                  pod/control-9c565658c-tlrkp            Pulling image "723128751635.dkr.ecr.us-east-1.amazonaws.com/control:latest"
    28m         Normal    Pulled                   pod/control-9c565658c-tlrkp            Successfully pulled image "723128751635.dkr.ecr.us-east-1.amazonaws.com/control:latest"
    28m         Normal    Created                  pod/control-9c565658c-tlrkp            Created container app
    28m         Normal    Started                  pod/control-9c565658c-tlrkp            Started container app
    28m         Normal    Scheduled                pod/control-bcf7d485c-qm52k            Successfully assigned camel/control-bcf7d485c-qm52k to ip-192-168-111-132.ec2.internal
    28m         Normal    SuccessfulCreate         replicaset/control-bcf7d485c           Created pod: control-bcf7d485c-qm52k
    28m         Normal    Pulling                  pod/control-bcf7d485c-qm52k            Pulling image "723128751635.dkr.ecr.us-east-1.amazonaws.com/control:latest"
    28m         Normal    Pulled                   pod/control-bcf7d485c-qm52k            Successfully pulled image "723128751635.dkr.ecr.us-east-1.amazonaws.com/control:latest"
    28m         Normal    Created                  pod/control-bcf7d485c-qm52k            Created container app
    28m         Normal    Started                  pod/control-bcf7d485c-qm52k            Started container app
    28m         Normal    Killing                  pod/control-9c565658c-tlrkp            Stopping container app
    28m         Normal    SuccessfulDelete         replicaset/control-9c565658c           Deleted pod: control-9c565658c-tlrkp
    27m         Normal    Killing                  pod/control-bcf7d485c-qm52k            Stopping container app
    27m         Normal    SuccessfulDelete         replicaset/control-bcf7d485c           Deleted pod: control-bcf7d485c-qm52k
    27m         Normal    Scheduled                pod/control-598448fcc7-k9tsg           Successfully assigned camel/control-598448fcc7-k9tsg to ip-192-168-24-93.ec2.internal
    27m         Normal    SuccessfulCreate         replicaset/control-598448fcc7          Created pod: control-598448fcc7-k9tsg
    27m         Normal    SuccessfulAttachVolume   pod/control-598448fcc7-k9tsg           AttachVolume.Attach succeeded for volume "pvc-0a3ef695-92b3-49a3-bdb9-705e9d71c0ef"
    27m         Normal    Created                  pod/control-598448fcc7-k9tsg           Created container okteto-bin
    27m         Normal    Pulled                   pod/control-598448fcc7-k9tsg           Container image "okteto/bin:1.3.3" already present on machine
    27m         Normal    Started                  pod/control-598448fcc7-k9tsg           Started container okteto-bin
    27m         Normal    Started                  pod/control-598448fcc7-k9tsg           Started container okteto-init-volume
    27m         Normal    Created                  pod/control-598448fcc7-k9tsg           Created container okteto-init-volume
    27m         Normal    Pulled                   pod/control-598448fcc7-k9tsg           Container image "okteto/node:14" already present on machine
    27m         Normal    Created                  pod/control-598448fcc7-k9tsg           Created container app
    27m         Normal    Pulled                   pod/control-598448fcc7-k9tsg           Successfully pulled image "okteto/node:14"
    27m         Normal    Started                  pod/control-598448fcc7-k9tsg           Started container app
    27m         Normal    Pulling                  pod/control-598448fcc7-k9tsg           Pulling image "okteto/node:14"
    14m         Warning   FailedAttachVolume       pod/control-9c565658c-5wxmp            Multi-Attach error for volume "pvc-0a3ef695-92b3-49a3-bdb9-705e9d71c0ef" Volume is already exclusively attached to one node and can't be attached to another
    14m         Normal    SuccessfulCreate         replicaset/control-9c565658c           Created pod: control-9c565658c-5wxmp
    14m         Normal    Killing                  pod/control-598448fcc7-k9tsg           Stopping container app
    14m         Normal    SuccessfulDelete         replicaset/control-598448fcc7          Deleted pod: control-598448fcc7-k9tsg
    14m         Normal    Scheduled                pod/control-9c565658c-5wxmp            Successfully assigned camel/control-9c565658c-5wxmp to ip-192-168-8-128.ec2.internal
    13m         Normal    SuccessfulAttachVolume   pod/control-9c565658c-5wxmp            AttachVolume.Attach succeeded for volume "pvc-0a3ef695-92b3-49a3-bdb9-705e9d71c0ef"
    13m         Normal    Started                  pod/control-9c565658c-5wxmp            Started container okteto-bin
    13m         Normal    Created                  pod/control-9c565658c-5wxmp            Created container okteto-bin
    13m         Normal    Pulled                   pod/control-9c565658c-5wxmp            Container image "okteto/bin:1.3.3" already present on machine
    13m         Normal    Pulled                   pod/control-9c565658c-5wxmp            Container image "okteto/node:14" already present on machine
    13m         Normal    Created                  pod/control-9c565658c-5wxmp            Created container okteto-init-volume
    13m         Normal    Started                  pod/control-9c565658c-5wxmp            Started container okteto-init-volume
    13m         Normal    Pulling                  pod/control-9c565658c-5wxmp            Pulling image "723128751635.dkr.ecr.us-east-1.amazonaws.com/control:latest"
    13m         Normal    Pulled                   pod/control-9c565658c-5wxmp            Successfully pulled image "723128751635.dkr.ecr.us-east-1.amazonaws.com/control:latest"
    13m         Normal    Started                  pod/control-9c565658c-5wxmp            Started container app
    13m         Normal    Created                  pod/control-9c565658c-5wxmp            Created container app

    This is the deployment's active manifest AFTER Okteto crashed:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      annotations:
        deployment.kubernetes.io/revision: "49"
        kubectl.kubernetes.io/last-applied-configuration: |
          {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"name":"control","namespace":"fox"},"spec":{"selector":{"matchLabels":{"name":"control"}},"template":{"metadata":{"labels":{"name":"control"}},"spec":{"containers":[{"envFrom":[{"configMapRef":{"name":"control"}},{"configMapRef":{"name":"general"}},{"secretRef":{"name":"control"}},{"secretRef":{"name":"jwt"}}],"image":"723128751635.dkr.ecr.us-east-1.amazonaws.com/control:latest","imagePullPolicy":"Always","name":"app","ports":[{"containerPort":3000}]}]}}}}
      creationTimestamp: "2021-08-01T10:41:23Z"
      generation: 148
      name: control
      namespace: fox
      resourceVersion: "130144817"
      selfLink: /apis/apps/v1/namespaces/fox/deployments/control
      uid: 1b90a4f4-fa95-44f5-9e6d-9e8600e8cd84
    spec:
      progressDeadlineSeconds: 600
      replicas: 1
      revisionHistoryLimit: 10
      selector:
        matchLabels:
          name: control
      strategy:
        rollingUpdate:
          maxSurge: 25%
          maxUnavailable: 25%
        type: RollingUpdate
      template:
        metadata:
          annotations:
            kubectl.kubernetes.io/restartedAt: "2021-08-19T10:18:35+03:00"
          creationTimestamp: null
          labels:
            name: control
        spec:
          containers:
          - envFrom:
            - configMapRef:
                name: control
            - configMapRef:
                name: general
            - secretRef:
                name: control
            - secretRef:
                name: jwt
            image: 723128751635.dkr.ecr.us-east-1.amazonaws.com/control:latest
            imagePullPolicy: Always
            name: app
            ports:
            - containerPort: 3000
              protocol: TCP
            resources: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          restartPolicy: Always
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
    status:
      availableReplicas: 1
      conditions:
      - lastTransitionTime: "2021-08-01T10:41:23Z"
        lastUpdateTime: "2022-02-16T17:18:45Z"
        message: ReplicaSet "control-796fc564b5" has successfully progressed.
        reason: NewReplicaSetAvailable
        status: "True"
        type: Progressing
      - lastTransitionTime: "2022-03-13T20:35:34Z"
        lastUpdateTime: "2022-03-13T20:35:34Z"
        message: Deployment has minimum availability.
        reason: MinimumReplicasAvailable
        status: "True"
        type: Available
      observedGeneration: 148
      readyReplicas: 1
      replicas: 1
      updatedReplicas: 1

Any idea why the ReplicaSet would scale up and down? We'd really appreciate your help, and we'd love to provide more information if necessary!

vl-kp commented 1 year ago

Is there an annotation that disables auto-sync?

thesuperzapper commented 4 months ago

An explicit argocd.argoproj.io/sync-options: Ignore=true (similar to the existing Prune=false) would be great, as it would let people annotate a resource that needs to be manually changed (e.g. during an emergency or a test) and have Argo CD not keep updating it.
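
A sketch of how this might look if it reused the existing sync-options annotation syntax; note that Ignore=true is only the suggestion made here and is not an option Argo CD currently supports:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: control    # the example deployment from this thread
  annotations:
    # proposed, not implemented: Argo CD would skip this resource during
    # syncs and self-heal, leaving manual/emergency changes untouched
    argocd.argoproj.io/sync-options: Ignore=true
# (rest of the Deployment spec unchanged)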

thesuperzapper commented 4 months ago

@crenshaw-dev Just so we are clear, using argocd.argoproj.io/compare-options: IgnoreExtraneous does not stop Argo CD from updating the resource; it only stops Argo CD from reporting an application as "out of sync" when that resource exists in the cluster but not in the application source.

That is to say, for the use case of disabling syncing for a specific resource, argocd.argoproj.io/compare-options: IgnoreExtraneous does nothing at all, because the resource will exist in the source and therefore be updated with every sync.

vhurtevent commented 3 months ago

Hello, I'm also interested in this feature.

Our use case: temporarily disable Argo CD sync on resources during a maintenance window, which could be: