Open crenshaw-dev opened 2 years ago
Local diff/syncing is currently implemented by running repo-server code on the CLI's host machine. It's the user's responsibility to install the necessary CLIs (like Helm or Kustomize).
We shouldn't expect users to configure their local machine to run the same CMPs as those installed on the repo-server. Instead, we should have the CLI send the user's local files to the repo-server (via the Argo CD API), and have the repo-server send back the generated diff.
Implement the necessary CLI/API/repo-server components to perform server-side diffing.
--server-side-generate
flag (defaulted to false
).CLI: Print a warning whenever --local
is used without --server-side-generate
.
WARNING: Client-side local diffing/syncing does not support sidecar CMPs and is DEPRECATED in vX.Y.Z. Switch to using `--server-side-generate`. Server-side local diffing/syncing will be default behavior in vX.Y+1.Z.
--local-include
argument accepting a glob pattern (or maybe regex?) to determine which files are sent to the repo-server. If using globs, accept multiple (and union whatever they produce). Use a reasonable default like [.yaml, .json, etc.] that will exclude likely-unnecessary files (like .go, .java, etc.).CLI: Add a warning before sending files for diffing.
WARNING: The following files will be sent to the Argo CD server for manifest generation.
<files>
Continue (use --skip-files-warning to disable this check)? Y/n
--skip-files-warning
flag in case the user needs a non-interactive diff/sync.API: Add/update a diff/sync endpoint to accept user files (as an archive) and send that to repo-server for manifest generation.
Be careful about what you log. Don't want to inadvertently log a secret that the user accidentally sent with their source files.
repo-server: Add/update a manifest generation endpoint to accept the user's files as an archive. Implement logic to 1) copy the app's current source, 2) unpack the user-provided archive over the source, 3) generate the manifests, and 4) delete the source copy.
Read the archive library docs carefully the ensure we're not introducing security vulnerabilities (like directory traversal) to the repo-server.
Be careful about what you log. Don't want to inadvertently log a secret that the user accidentally sent with their source files.
Noted above, but repeated here:
Not totally fixed. We should do this: https://github.com/argoproj/argo-cd/issues/10936
i'm seeing this error
FATA[0002] rpc error: code = Unknown desc = error receiving manifest file stream: error receiving tgz file: file checksum validation error: calc e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 sent f1b4f206df2ea9a29fa819c5a2c6bdc883cd79433bbf9ad36dd11e600306d4e7
when running argocd app diff jx-preprod --server-side-generate --local "$PWD"
from the directory that contains my helmfile.yaml. my app uses this helmfile CMP
this is the argocd application object
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
creationTimestamp: "2022-10-29T22:04:37Z"
generation: 2847
name: jx-preprod
namespace: argocd
ownerReferences:
- apiVersion: argoproj.io/v1alpha1
blockOwnerDeletion: true
controller: true
kind: ApplicationSet
name: dev
uid: 8f4d596e-8ffd-452c-b2d0-f4a5dedbf2e1
resourceVersion: "724952059"
uid: 3ee646fa-67f2-4787-ab73-6e34bdbb1dbc
spec:
destination:
namespace: jx
server: https://kubernetes.default.svc
ignoreDifferences:
- group: admissionregistration.k8s.io
jqPathExpressions:
- .webhooks[].namespaceSelector.matchExpressions[] | select(.key == "control-plane")
kind: MutatingWebhookConfiguration
name: istio-sidecar-injector
project: dev
source:
path: helmfiles/jx
plugin:
env:
- name: HELMFILE_USE_CONTEXT_NAMESPACE
value: "true"
- name: HELM_TEMPLATE_OPTIONS
value: --skip-tests
name: helmfile
repoURL: https://github.com/my-org/jenkins-x-dev
targetRevision: HEAD
syncPolicy:
syncOptions:
- CreateNamespace=true
sourceType: Plugin
summary:
externalURLs:
- https://bucketrepo-jx.dev.my-org.com
- https://dashboard-jx.dev.my-org.com
- https://hook-jx.dev.my-org.com/zzhookzz
- https://lighthouse-jx.dev.my-org.com
images:
- chartmuseum/chartmuseum:v0.12.0
- docker.io/bitnami/external-dns:0.11.1-debian-10-r23
- ghcr.io/jenkins-x-plugins/jx-preview:0.0.225
- ghcr.io/jenkins-x/bucketrepo:0.1.67
- ghcr.io/jenkins-x/jx-boot:3.7.8
- ghcr.io/jenkins-x/jx-build-controller:0.4.7
- ghcr.io/jenkins-x/jx-pipelines-visualizer:1.8.2
- ghcr.io/jenkins-x/jx-slack:0.2.1
- ghcr.io/jenkins-x/lighthouse-foghorn:1.10.6
- ghcr.io/jenkins-x/lighthouse-gc-jobs:1.10.6
- ghcr.io/jenkins-x/lighthouse-keeper:1.10.6
- ghcr.io/jenkins-x/lighthouse-tekton-controller:1.10.6
- ghcr.io/jenkins-x/lighthouse-webhooks:1.10.6
- ghcr.io/jenkins-x/lighthouse-webui-plugin:0.1.7
sync:
comparedTo:
destination:
namespace: jx
server: https://kubernetes.default.svc
source:
path: helmfiles/jx
plugin:
env:
- name: HELMFILE_USE_CONTEXT_NAMESPACE
value: "true"
- name: HELM_TEMPLATE_OPTIONS
value: --skip-tests
name: helmfile
repoURL: https://github.com/my-org/jenkins-x-dev
targetRevision: HEAD
revision: a8f2bfc348431499db186a2f1a3ba95d58926fce
status: OutOfSync
✗ argocd version
argocd: v2.5.1+504da42
BuildDate: 2022-11-01T21:30:52Z
GitCommit: 504da424c2c9bb91d7fb2ebf3ae72162e7a5a5be
GitTreeState: clean
GoVersion: go1.18.7
Compiler: gc
Platform: darwin/amd64
argocd-server: v2.5.0+b895da4
BuildDate: 2022-10-25T14:40:01Z
GitCommit: b895da457791d56f01522796a8c3cd0f583d5d91
GitTreeState: clean
GoVersion: go1.18.7
Compiler: gc
Platform: linux/amd64
Kustomize Version: v4.5.7 2022-08-02T16:35:54Z
Helm Version: v3.10.1+g9f88ccb
Kubectl Version: v0.24.2
Jsonnet Version: v0.18.0
You get that consistently? The checksum check is meant to prevent tampering or data corruption. I'd be really surprised to see it consistently.
Can you run a hard refresh to make sure you don't keep getting a cached error?
i tried
--refresh
to the cli command--local-include helmfile.yaml
same result. the command also fails against a different app
argocd app diff istio-system-preprod --server-side-generate --local "$PWD" --refresh --local-include helmfile.yaml
WARN[0001] spec.plugin.name is set, which means this Application uses a plugin installed in the argocd-cm ConfigMap. Installing plugins via that ConfigMap is deprecated in Argo CD v2.5. Starting in Argo CD v2.6, this Application will fail to sync. Contact your Argo CD admin to make sure an upgrade plan is in place. More info: https://argo-cd.readthedocs.io/en/latest/operator-manual/upgrading/2.4-2.5/
FATA[0005] rpc error: code = Unknown desc = error receiving manifest file stream: error receiving tgz file: file checksum validation error: calc e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 sent c897b354287cb29d7886a775131f3514113e7e1a35e48c87b1ff6b5f09a3ae0e
argocd log logs
argocd-server-86c9cf5874-8p8sw server time="2022-11-04T19:47:13Z" level=info msg="received unary call /cluster.SettingsService/Get" grpc.method=Get grpc.request.content= grpc.service=cluster.SettingsService grpc.start_time="2022-11-04T19:47:13Z" span.kind=server system=grpc
argocd-server-86c9cf5874-8p8sw server time="2022-11-04T19:47:13Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=Get grpc.service=cluster.SettingsService grpc.start_time="2022-11-04T19:47:13Z" grpc.time_ms=8.785 span.kind=server system=grpc
argocd-server-86c9cf5874-8p8sw server time="2022-11-04T19:47:13Z" level=info msg="received streaming call /application.ApplicationService/GetManifestsWithFiles" grpc.method=GetManifestsWithFiles grpc.request.content="query:<name:\"istio-system-preprod\" checksum:\"c897b354287cb29d7886a775131f3514113e7e1a35e48c87b1ff6b5f09a3ae0e\" appNamespace:\"\" > " grpc.service=application.ApplicationService grpc.start_time="2022-11-04T19:47:13Z" span.kind=server system=grpc
argocd-server-86c9cf5874-8p8sw server I1104 19:47:14.884336 1 request.go:601] Waited for 1.015823353s due to client-side throttling, not priority and fairness, request: GET:https://172.31.232.1:443/apis/security.istio.io/v1beta1?timeout=32s
argocd-server-86c9cf5874-8p8sw server time="2022-11-04T19:47:15Z" level=error msg="Partial success when performing preferred resource discovery" error="unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request"
argocd-repo-server-545b5785b4-jp2ld repo-server time="2022-11-04T19:47:15Z" level=error msg="finished streaming call with code Unknown" error="error receiving manifest file stream: error receiving tgz file: file checksum validation error: calc e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 sent c897b354287cb29d7886a775131f3514113e7e1a35e48c87b1ff6b5f09a3ae0e" grpc.code=Unknown grpc.method=GenerateManifestWithFiles grpc.service=repository.RepoServerService grpc.start_time="2022-11-04T19:47:15Z" grpc.time_ms=1.423 span.kind=server system=grpc
argocd-server-86c9cf5874-8p8sw server time="2022-11-04T19:47:15Z" level=error msg="finished streaming call with code Unknown" error="rpc error: code = Unknown desc = error receiving manifest file stream: error receiving tgz file: file checksum validation error: calc e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 sent c897b354287cb29d7886a775131f3514113e7e1a35e48c87b1ff6b5f09a3ae0e" grpc.code=Unknown grpc.method=GetManifestsWithFiles grpc.service=application.ApplicationService grpc.start_time="2022-11-04T19:47:13Z" grpc.time_ms=1540.895 span.kind=server system=grpc
is there an example of the directory context and files to include when using --server-side-generate
?
Following the original motivation, what do you think of app manifests
also accepting --server-side-generate
?
Also seeing the "file checksum validation error" consistently when running app diff --local apps/foo --server-side-generate
both locally and on the CI server (ephemeral containers, git clone -> helm dep up -> argocd app diff)
FATA[0004] rpc error: code = Unknown desc = error receiving manifest file stream: error receiving tgz file: file checksum validation error
Seems the crc is not being properly computed or there's a difference in the way it's being generated.
@cilindrox can you check the server-side log and post the checksum values?
I did a deep dive on this error with @joshuasimon-taulia and found that when the API server receives the streamed file it receives metadata but no actual file contents. So its checksum is always the checksum of an empty array of bytes.
I still have no idea why that is happening.
Thanks for the quick reply @crenshaw-dev - I'm not seeing any relevant logs on app.kubernetes.io/name=argocd-server
- (checked every other instance, just in case and still no dice).
Here's the crc client-side:
calc e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 sent cc547b59cb862b3f3b389d4cbbed234d27f0282e3de4eab31833fd2bdb7e1ce8
Should I be running this with a debug
log level or similar?
Edit: lack of logs got me thinking - could this be getting filtered at the ingress level? ie: the payload's not reaching the server?
Yep, e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
is the same "calculated" checksum as what @joshuasimon-taulia was getting - zero bytes.
Should I be running this with a debug log level or similar?
Wouldn't hurt, but I don't think there will be much more info. I think the issue is pretty low-level.
could this be getting filtered at the ingress level? ie: the payload's not reaching the server?
Quite possibly. I doubt that the server is receiving the bytes and then just ignoring them. I bet they're not making it over the network.
could this be getting filtered at the ingress level? ie: the payload's not reaching the server?
we tried the same exercise using kubectl port-forward
to bypass nginx ingress, and the results were still the same
I hit the same issue in a scenario where we are using --port-forward
and --plaintext
on the argocd
CLI - with the same exact checksum for the empty array of bytes.
For now, we'll skirt around this issue in our use case (which is to show the diffs on pull requests pre-merge) by using the --revision
flag, but it'd be nice to be able to run it locally, obviously.
Hi!
I also got into the error of checksums not matching today when trying to figure out the diff from local path. I see that the issue referenced in this PR was removed from the v2.6 milestone but the warning about not using --server-side-generate
is still present and it clearly says that it will become the default in 2.6.
Does it mean that this issue will be addressed before v2.6 is released? Since otherwise once v2.6 would be released argocd app diff --local
would be broken.
Having encountered this checksum problem myself, I can contribute to the confusion a bit: in my environment I could narrow this particular problem down to incorrectly setup non-web gRPC ingress. After fixing my setup, I could verify that app diff --local .
works (for various stages of "works", I'm having a different problem after that step) without --grpc-web
and doesn't (EOF
) with --grpc-web
.
For my case it works if if I use --port-forward
and skip the ingress. @rassie how did you fix your ingress? Are you using nginx ingress controller?
@cha7ri I'm on k3d/k3s, so it's a Traefik-based ingress controller. It seem that the Ingresses provided by the Helm chart have been incomplete, so I've bit the bullet and implemented Traefik Ingress as described in the docs.
I'm also getting this issue where the file streaming part of diff --server-side-generate through the grpc-web just doesn't work and the end result is a different checksum.
We cannot use pure grpc as our ArgoCD is behind Cloudflare Access which doesn't support any grpc.
Spreading the knowledge: grpc-web doesn't support client streaming, except by websocket: https://github.com/argoproj/argo-cd/issues/12032#issuecomment-1608190509
Can I find out what the latest is on this?
From what I'm gathering (I'm using traefik as ingress):
ingressroute
instead of using ingress - done--grpc-web
with --local
- done--server-side-generate
- done--local-include "*.*"
to include raw files outside my templates - done--local
and not the chart directory - doneI'm still getting both these errors and haven't been able to figure out a combination that works for server side diffs that includes Kubernetes schema validation!
FATA[0008] failed to send manifest stream file: error sending stream: EOF
or
FATA[0004] rpc error: code = Unknown desc = error receiving manifest file stream: error receiving tgz file: file checksum validation error
Describe the bug
argocd app diff <appname> --local .
fails when<appname>
is handled by a sidecar CMP. Specifically, it fails with this error:argocd app sync <appname> --local .
fails with the same error.To Reproduce
argocd app diff <appname> --local .
from the directory in the git clone where the app is defined.Expected behavior
I would expect a diff to be generated. Since the example CMP always produces the same manifest, I'd expect the diff to be empty.
Version