argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0
17.6k stars 5.36k forks source link

argocd app diff, exit code is always the same #3588

Open pere3 opened 4 years ago

pere3 commented 4 years ago

Checklist:

Describe the bug

We are trying to use argocd diff output to determine situations, where next deployment will actually delete something inside out k8s cluster, rather than update or add.

When using argocd app diff command, exit code is always the same for actual "found diff" and "Failed to establish connection" cases.

To Reproduce

$ argocd --auth-token ${AUTH_TOKEN} --server argocd.cloud.test.org:80 --plaintext app diff python-mcs-test --local ./
INFO[0001] helm template . --name-template python-mcs-test --namespace microservices --values values.yaml --include-crds  dir=./ execID=IRBoN
===== apps/Deployment microservices/python-mcs-test-test-aiocron ======
10c10
<     test.com/image-tag: destructive-change-gke-20200514141114
---
>     test.com/image-tag: destructive-change-gke-20200514122504
57c57
<         image: gcr.io/test-cloud-48483/python-mcs-test:destructive-change-gke-20200514141114
---
>         image: gcr.io/test-cloud-48483/python-mcs-test:destructive-change-gke-20200514122504
===== apps/Deployment microservices/python-mcs-test ======
10c10
<     test.com/image-tag: destructive-change-gke-20200514141114
---
>     test.com/image-tag: destructive-change-gke-20200514122504
60c60
<         image: gcr.io/test-cloud-48483/python-mcs-test:destructive-change-gke-20200514141114
---
>         image: gcr.io/test-cloud-48483/python-mcs-test:destructive-change-gke-20200514122504
===== batch/CronJob microservices/python-mcs-test-test-cron1 ======
10c10
<     test.com/image-tag: destructive-change-gke-20200514141114
---
>     test.com/image-tag: destructive-change-gke-20200514122504
46c46
<             image: gcr.io/test-cloud-48483/python-mcs-test:destructive-change-gke-20200514141114
---
>             image: gcr.io/test-cloud-48483/python-mcs-test:destructive-change-gke-20200514122504
$ echo $?
1

$ argocd --auth-token ${AUTH_TOKEN} --server argocd.cloud.test.org:81 --plaintext app diff python-mcs-test --local ./
FATA[0020] Failed to establish connection to argocd.cloud.test.org:81: dial tcp 10.166.10.3:81: operation was canceled
$ echo $?
1

Expected behavior

We expect that argocd will throw different exit signals on failures (connection error), or at least an ability to correct it's behaviour with flags and parameters.

Same exit code values makes it impossible to distinguish actual command result inside jenkins pipeline.

Version

argocd: v1.5.1+8a3b36b
  BuildDate: 2020-04-06T15:59:42Z
  GitCommit: 8a3b36bd28fe278f7407b9a8797c79ad859d1acd
  GitTreeState: clean
  GoVersion: go1.14
  Compiler: gc
  Platform: darwin/amd64
argocd-server: v1.5.4+36bade7
  BuildDate: 2020-05-05T19:01:57Z
  GitCommit: 36bade7a2d7b69d1c0b0c4d41191f792a847d61c
  GitTreeState: clean
  GoVersion: go1.14.1
  Compiler: gc
  Platform: linux/amd64
  Ksonnet Version: v0.13.1
  Kustomize Version: {Version:kustomize/v3.5.4 GitCommit:3af514fa9f85430f0c1557c4a0291e62112ab026 BuildDate:2020-01-11T03:12:59Z GoOs:linux GoArch:amd64}
  Helm Version: version.BuildInfo{Version:"v3.2.0", GitCommit:"e11b7ce3b12db2941e90399e874513fbd24bcb71", GitTreeState:"clean", GoVersion:"go1.13.10"}
  Kubectl Version: v1.14.0
jannfis commented 4 years ago

That probably applies to most of the CLI commands. But I agree with you, we should have different (documented) exit codes for different exit scenarios, and thus making automation using the CLI much more practical.

pere3 commented 4 years ago

Thanks, we also believe that valid diff output shouldn't be followed with a non-zero exit code, like it happens in helm-diff plugin (https://github.com/databus23/helm-diff).

Because diffs are usually a normal thing, but non-zero exit code is not. And it is an CI/CD job to determine this diff impact and validity.

jannfis commented 4 years ago

It's hard to tell what exit code is right. Following semantics of classical UNIX diff tool, the exit code is 1as well when the compared data differs. However, git diff returns exit code 0 regardless of differences or not.

Personally, I believe for automation it might be beneficial if different exit codes are used on different results of the operation, i.e. whether there have been differences or not. This allows for simple conditionals in shell, such as

if diff foo bar; then
  # there were no differences
else
  # there were differences if $? is 1, otherwise an error
fi

I think we should not change the semantics of the exit code for when there is a difference or not. But I agree, that we should use other exit codes for failures, such as API server not reachable, application does not exist, etc.

pere3 commented 4 years ago

Thanks a lot, i kinda forgot how unix diff operates - it really gives you a 1, if there is any difference between files (it works with binaries too!).

Will wait for other exit codes on failures implementation, any approx. ETA on that?

jannfis commented 4 years ago

Sorry, can't give you an ETA. The issue queue is quite full at the moment, but maybe a volunteer will step forward? :)

spirosoik commented 3 years ago

I can give it a try

ameytotawar commented 1 year ago

I am working on this issue.

OmerKahani commented 1 year ago

Looking at the code, in most cases an error will lead to exit code 20

But there are two commands that will use exit code 1 inside the command: headless.NewClientOrDie & clientset.NewApplicationClientOrDie

The commands uses log.Fatal & log.Fatalf which will use exit code 1 As both of the commands are widely used, changing them will result in a breaking change. The other option is to change the exit code used to signal there is a diff, which currently is set to 1, but that is also a breaking change :(

eugene70 commented 1 week ago

I analyzed this issue. I think it would be better to allow specifying a code as an option rather than returning a fixed code. There is also a similar option called --exit-code, which is a boolean value that returns an error code of 1 if true, and 0 if false. The default is true.

if foundDiffs && exitCode {
  os.Exit(1)
}

For compatibility reasons, this option will probably have to remain. I think a new option would be something like --diff-exit-code or --exit-code-when-diff.