argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

UI 2.4.0-rc1 Not able to access web terminal #9335

Closed PatrickZuell closed 2 years ago

PatrickZuell commented 2 years ago

Hello,

after updating to the newest release I tried to test the web terminal. I found the "exec" button on the Pod, but when I click it I only get a white page instead of a terminal.

argocdterminalerror

34fathombelow commented 2 years ago

@PatrickZuell Have you enabled this feature? This feature is disabled by default for security considerations. For more information check https://argo-cd.readthedocs.io/en/latest/operator-manual/rbac/#exec-resource

crenshaw-dev commented 2 years ago

Also, there should be no exec button when the feature is disabled. Where did you see the button?

34fathombelow commented 2 years ago

> Also, there should be no exec button when the feature is disabled. Where did you see the button?

@crenshaw-dev I can confirm the exec button does show on the pods. I just checked the demo server, any pod you click has the exec button.

avg07 commented 2 years ago

Hello. After setting exec.enabled in the argocd-cm ConfigMap, the terminal tab is there but shows an error: WebSocket connection to 'wss...

Screenshot_2

crenshaw-dev commented 2 years ago

@avg07 have you enabled Kubernetes RBAC on the API server to allow exec? The API server functions as a proxy to provide terminal support, so it needs access. https://argo-cd.readthedocs.io/en/latest/operator-manual/rbac/#exec-resource

leoluz commented 2 years ago

@avg07 @crenshaw-dev However, this is showing a generic Internal error message. :( It would be great if we could distinguish between permission errors, container errors, and internal errors. By container error I mean a dedicated error message that tells the user that the container doesn't have one of the supported shells. I think we need those error messages to reduce support requests related to this feature.

ToniIltanen commented 2 years ago

Same here, even with the admin user, which has all privileges on the cluster role.

crenshaw-dev commented 2 years ago

@leoluz agreed, the error message should be much better.

@ToniIltanen even if you as a user have the admin role, that doesn't mean the API server will have pods/exec access. The API server's ServiceAccount must be bound to a role which has pods/exec access to the pod you're trying to exec into. Argo CD execs into the pod on your behalf; it doesn't use your user permissions to do so.

ToniIltanen commented 2 years ago

@crenshaw-dev it said in the documentation that:

You will also need to add the following to the argocd-api-server Role **or** ClusterRole.

- apiGroups:
  - ""
  resources:
  - pods/exec
  verbs:
  - create

now the cluster role for argocd-admin already has the following:

rules:
- apiGroups:
  - '*'
  resources:
  - '*'
  verbs:
  - '*'
- nonResourceURLs:
  - '*'
  verbs:
  - '*'

So you might expect that if the cluster role has wildcards, you shouldn't have to create any other bindings. I do understand that this is a different account; I assumed it was a matter of user privileges, i.e. who can even access the terminal feature.

Is there any guide on how to add/modify the ServiceAccount and how to bind it? The documentation does not help with that.

crenshaw-dev commented 2 years ago

@ToniIltanen I think you're looking at the ClusterRole for the controller. The API server's cluster role needs the additional permissions.

https://github.com/argoproj/argo-cd/blob/master/manifests/cluster-rbac/server/argocd-server-clusterrole.yaml

You're right, the documentation isn't the best. I'll put up a PR to clarify.

crenshaw-dev commented 2 years ago

I think those docs are also just wrong the way I wrote them. It will only work for all Pods if you set those permissions on the ClusterRole.
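
For anyone who wants something to copy from, here is a minimal sketch of a separate ClusterRole and binding that grants only exec. It assumes a default install where the API server runs as the `argocd-server` ServiceAccount in the `argocd` namespace; the `argocd-server-exec` name is made up for illustration, and customized installs (e.g. Helm) may name things differently:

```yaml
# Hypothetical ClusterRole that grants only the exec permission.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: argocd-server-exec   # illustrative name, not part of the upstream manifests
rules:
- apiGroups:
  - ""
  resources:
  - pods/exec
  verbs:
  - create
---
# Binds the rule above to the API server's ServiceAccount.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: argocd-server-exec   # illustrative name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: argocd-server-exec
subjects:
- kind: ServiceAccount
  name: argocd-server        # assumed default ServiceAccount name
  namespace: argocd          # assumed install namespace
```

Adding the same `pods/exec` rule directly to the existing argocd-server ClusterRole should have the same effect; a separate object just keeps the change apart from the stock manifests.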

ToniIltanen commented 2 years ago

Thanks @crenshaw-dev, I got it working by modifying the argocd-server ClusterRole. If you update the docs with a "suggested way of configuration", I'll change our configs accordingly.

crenshaw-dev commented 2 years ago

@ToniIltanen how do you feel about this clarification? https://github.com/argoproj/argo-cd/pull/9354/files

I hesitate to go into a bunch of detail, 'cause a lot of folks use heavily customized Argo CD installs (e.g. Helm). Seems potentially better to just point out where the change needs to be made and let folks apply the change as they see fit.

ToniIltanen commented 2 years ago

I think it's better, but I'd also suggest including a complete example.

I think it would be cool if the docs would show how to enable exec in full, example in helm values.yaml

avg07 commented 2 years ago

@crenshaw-dev My argocd-server ClusterRole looks like that: (Screenshot) But i get the same wss error. Maybe I should restart the argocd-server?

Screenshot_1

ToniIltanen commented 2 years ago

@avg07 Do you have websockets enabled in your ingress, so that the websocket connection can be established at all?

avg07 commented 2 years ago

@ToniIltanen Yes. But I also tried directly through NodePort. Same error

ToniIltanen commented 2 years ago

Did you enable the exec on the config map?

avg07 commented 2 years ago

Yes.

Screenshot_3

ToniIltanen commented 2 years ago

Do you use the admin user? Or something else?

If you use a user other than admin, you must also assign the privileges. The admin user has the exec permission, but for others you must assign it to the group:

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  policy.csv: |
    p, role:foobar, exec, get, *, allow
    g, foo, role:foobar

avg07 commented 2 years ago

I know. I use admin user

Screenshot_4

ToniIltanen commented 2 years ago

Weird, we have an almost 100% similar setup, except we use nginx ingress (with a separate /terminal location for websocket connections), and it works as it should. We use Keycloak authentication instead of the native one.

our ingress

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argo-ingress
  namespace: argocd
  annotations:
     kubernetes.io/ingress.class: "nginx"
     cert-manager.io/cluster-issuer: "letsencrypt-prod"
     nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
     nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
     nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
     nginx.ingress.kubernetes.io/websocket-services: "argocd-server"
     nginx.ingress.kubernetes.io/server-snippets: |
       location /terminal {
        proxy_set_header Upgrade $http_upgrade;
        proxy_http_version 1.1;
        proxy_set_header X-Forwarded-Host $http_host;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header Host $host;
        proxy_set_header Connection "upgrade";
        proxy_cache_bypass $http_upgrade;
        }
spec:
  tls:
  - hosts:
    - foobar.com
    secretName: argocd-prod-tls
  rules:
  - host: "foobar.com"
    http:
      paths:
      - pathType: Prefix
        path: /
        backend:
          service:
            name: argocd-server
            port:
              number: 443

config map

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/name: argocd-cm
    app.kubernetes.io/part-of: argocd
data:
  # Argo CD's externally facing base URL (optional). Required when configuring SSO
  url: https://foobar.com

  # Enables application status badge feature
  statusbadge.enabled: "true"

  # Override the Argo CD hostname root URL for both the project and the application status badges.
  # Here is an example of the application status badge for the app `myapp` to see what is replaced.
  #    <statusbadge.url>api/badge?name=myapp&revision=true
  # Provide custom URL to override. You must include the trailing forward slash:
  statusbadge.url: ""

  # Enables anonymous user access. The anonymous users get default role permissions specified argocd-rbac-cm.yaml.
  users.anonymous.enabled: "false"
  # Specifies token expiration duration
  users.session.duration: "24h"

  # Specifies regex expression for password
  passwordPattern: "^.{8,32}$"

  # Enables google analytics tracking is specified
  ga.trackingid: "UA-12345-1"
  # Unless set to 'false' then user ids are hashed before sending to google analytics
  ga.anonymizeusers: "true"

  # OIDC configuration as an alternative to dex (optional).
  oidc.config: |
    name: Keycloak
    issuer: https://foobar.com/realms/master
    clientID: argocd
    clientSecret: $oidc.keycloak.clientSecret
    requestedScopes: ["openid", "profile", "email", "groups"]
  # Per-version build options and binary paths
  kustomize.path.v3.9.1: /custom-tools/kustomize_3_9
  kustomize.buildOptions.v3.9.1: --enable_kyaml true
  kustomize.version.v3.5.1: /custom-tools/kustomize_3_5_1
  application.instanceLabelKey: mycompany.com/appname

  # disables admin user. Admin is enabled by default
  admin.enabled: "true"

  accounts.alice: apiKey, login
  accounts.alice.enabled: "false"
  accounts.github: apiKey
  accounts.github.enabled: "true"
  ui.cssurl: "./custom/my-styles.css"
  timeout.reconciliation: 180s
  exec.enabled: "true"

policy

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  policy.csv: |
    p, role:github-admin, projects, get, github-admin, allow
    p, role:github-admin, applications, *, */*, allow
    p, role:admin, exec, get, *, allow
    g, ArgoCDAdmins, role:admin
    g, github, role:github-admin

argocd-server cluster role


apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{},"labels":{"app.kubernetes.io/component":"server","app.kubernetes.io/name":"argocd-server","app.kubernetes.io/part-of":"argocd"},"name":"argocd-server"},"rules":[{"apiGroups":["*"],"resources":["*"],"verbs":["delete","get","patch"]},{"apiGroups":[""],"resources":["events"],"verbs":["list"]},{"apiGroups":[""],"resources":["pods","pods/log"],"verbs":["get"]}]}
  creationTimestamp: "2022-02-11T08:41:54Z"
  labels:
    app.kubernetes.io/component: server
    app.kubernetes.io/name: argocd-server
    app.kubernetes.io/part-of: argocd
  name: argocd-server
  resourceVersion: "2631766902"
  uid: foobar
rules:
- apiGroups:
  - '*'
  resources:
  - '*'
  verbs:
  - delete
  - get
  - patch
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - list
- apiGroups:
  - ""
  resources:
  - pods

avg07 commented 2 years ago

@ToniIltanen Just in case, I added your 'terminal' location, but no result. My ingress:

kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/configuration-snippet: |
      set_real_ip_from 103.21.244.0/22;
      set_real_ip_from 103.22.200.0/22;
      set_real_ip_from 103.31.4.0/22;
      set_real_ip_from 104.16.0.0/12;
      set_real_ip_from 108.162.192.0/18;
      set_real_ip_from 131.0.72.0/22;
      set_real_ip_from 141.101.64.0/18;
      set_real_ip_from 162.158.0.0/15;
      set_real_ip_from 172.64.0.0/13;
      set_real_ip_from 173.245.48.0/20;
      set_real_ip_from 188.114.96.0/20;
      set_real_ip_from 190.93.240.0/20;
      set_real_ip_from 197.234.240.0/22;
      set_real_ip_from 198.41.128.0/17;
      set_real_ip_from 2400:cb00::/32;
      set_real_ip_from 2606:4700::/32;
      set_real_ip_from 2803:f800::/32;
      set_real_ip_from 2405:b500::/32;
      set_real_ip_from 2405:8100::/32;
      set_real_ip_from 2c0f:f248::/32;
      set_real_ip_from 2a06:98c0::/29;
      real_ip_header CF-Connecting-IP;
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
    nginx.ingress.kubernetes.io/server-snippets: |
      location /terminal {
       proxy_set_header Upgrade $http_upgrade;
       proxy_http_version 1.1;
       proxy_set_header X-Forwarded-Host $http_host;
       proxy_set_header X-Forwarded-Proto $scheme;
       proxy_set_header X-Forwarded-For $remote_addr;
       proxy_set_header Host $host;
       proxy_set_header Connection "upgrade";
       proxy_cache_bypass $http_upgrade;
       }
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
  name: argocd-dev-ingress
  namespace: argocd
spec:
  rules:
  - host: host.com
    http:
      paths:
      - backend:
          serviceName: argocd-server
          servicePort: http
        path: /
  tls:
  - hosts:
    - host.com
    secretName: tls-secret

P.S. WebSockets are enabled in CF. Thanks for the help.

ToniIltanen commented 2 years ago

add this line also to the annotations, if it helps:

nginx.ingress.kubernetes.io/websocket-services: "argocd-server"

avg07 commented 2 years ago

No. But the problem is not in Ingress. Here is the error through NodePort:

Screenshot_5

ToniIltanen commented 2 years ago

@crenshaw-dev does the websockets work with ws:// or does the connection require wss://?

avg07 commented 2 years ago

> I can see in your url that you have http://192.168.97.15:30454, but the websocket connection is to different port. (ws://192.168.97.15:3045)

No Screenshot_6

ToniIltanen commented 2 years ago

Yeah, I saw that the number 4 was on another line, so I missed it; that's why I removed the comment. I haven't tried with ws://, so I asked @crenshaw-dev if there are any restrictions on that.

avg07 commented 2 years ago

Maybe the problem is that my server is running with the parameter "--insecure"? Screenshot_7

ToniIltanen commented 2 years ago

Could be, I don't know if the --insecure parameter affects the RBAC rules.

crenshaw-dev commented 2 years ago

Hm. I noticed this:

    p, role:foobar, exec, get, *, allow

I think in the RC, exec should use create.
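
In other words, something like the following sketch, which is the ConfigMap posted earlier in this thread with only the action changed from get to create (the `foobar` role and `foo` group are the same placeholders used there):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  # The exec resource is granted with the create action, not get.
  policy.csv: |
    p, role:foobar, exec, create, *, allow
    g, foo, role:foobar
```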

EDIT: ha, that's probably because the docs are still wrong. Thanks @34fathombelow for the catch! https://github.com/argoproj/argo-cd/pull/9354/files

> @crenshaw-dev does the websockets work with ws:// or does the connection require wss://?

The frontend code toggles ws vs. wss based on whether the UI is using http or https. I don't know much about how the backend support works. But it at least looks like it was intended to handle either.

crenshaw-dev commented 2 years ago

> I think it would be cool if the docs would show how to enable exec in full, example in helm values.yaml

Fair. The community maintains the Argo CD Helm chart. After they release a chart for 2.4 I can add instructions for setting the enable-terminal value.

PatrickZuell commented 2 years ago

Hi, thanks for all the responses. I am also using the helm release and I have updated everything.

Now I am able to see the terminal in the pods, but I am getting the same websocket error as @avg07. I tried it with the following ingress annotations:

          nginx.ingress.kubernetes.io/websocket-services: argocd-server
          nginx.org/websocket-services: argocd-server

But that did not help

PatrickZuell commented 2 years ago

> Hm. I noticed this:
>
>     p, role:foobar, exec, get, *, allow
>
> I think in the RC, exec should use create.

After changing that, it works. Thanks!

crenshaw-dev commented 2 years ago

@PatrickZuell awesome, glad y'all caught the docs issue. :-)

crenshaw-dev commented 2 years ago

@avg07 I take it you still have the websocket issue though?

avg07 commented 2 years ago

@crenshaw-dev Yes, I still have same error

Upd: Solved. Problem with "--insecure" flag

crenshaw-dev commented 2 years ago

Closing, since it looks like everyone's up and running. lmk if I need to reopen. :-)

crenshaw-dev commented 2 years ago

If anyone encounters the websocket issue and is unable to disable --insecure to fix it, please comment, and I'll reopen. We definitely want to support that use case.

leoluz commented 2 years ago

@crenshaw-dev I believe --insecure is a required flag for users exposing ArgoCD behind an Ingress. It might be a feature limitation for users running with this setup.

tooptoop4 commented 2 years ago

@crenshaw-dev can you re-open? i want --insecure behind Ingress. facing 'terminal connection error' on 2.4.0-rc3 with ALB, more details in https://github.com/argoproj/argo-cd/issues/9550

avg07 commented 2 years ago

@crenshaw-dev After upgrading to 2.4.0-rc3 I have the same websocket problem again, but this time without the "--insecure" parameter.

Update: Fixed in 2.4.0-rc4

tooptoop4 commented 2 years ago

i confirm works on 2.4.0-rc4

alexmt commented 2 years ago

Thank you for confirming! It was painful to debug because error messages are not shown anywhere in the UI. Creating another ticket to improve it.

jontambi commented 1 year ago

Hi there. First, thanks for the whole conversation; it helped me get the terminal working in ArgoCD. Apart from that, has anyone had trouble accessing the terminal for Ruby Alpine containers? When I hit the TERMINAL button the console appears, but when I try to run something in it nothing happens, and the following errors are shown in the argocd-server pod. Version: ArgoCD release v2.4.0-rc4

This is the error:

time="2022-06-09T16:16:14Z" level=info msg="terminal session starting" application=some-service cluster=services-project container=some-service namespace=somenamespace podName=somepod-name-service- userName=
E0609 16:16:14.680803 1 v2.go:105] EOF
time="2022-06-09T16:16:16Z" level=info msg="Alloc=38243 TotalAlloc=6130673 Sys=81748 NumGC=371 Goroutines=203"
time="2022-06-09T16:16:34Z" level=error msg="read message err: websocket: unexpected reserved bits 0x60"
E0609 16:16:34.593766 1 v2.go:105] websocket: unexpected reserved bits 0x60

crenshaw-dev commented 1 year ago

@jontambi can you provide a specific image tag we could test against?

jontambi commented 1 year ago

@crenshaw-dev Hi, I have tried with the following image tags:

djfinnoy commented 1 year ago

Encountered this problem, and discovered something important:

When you get the RBAC settings correct, you are only able to use UI exec on containers where bash is installed. Containers with sh will give you weird issues with websockets, and terminals where you are not allowed to type any input. I'm using the --insecure flag, so I doubt that's the cause of the issue.

12345ieee commented 1 year ago

Hello, I'm using v2.4.2 out of the latest version of the official helm chart (4.9.7). We're hitting the same Terminal Connection Error above, due to the GET wss://MY_URL..... NOT FOUND.

We use a few custom args, including the infamous --insecure:

  extraArgs:
    - --basehref
    - /argo-cd
    - --rootpath
    - /argo-cd
    - --insecure

to expose behind a nearly vanilla Ingress the helm chart deploys for us, which I show:

kind: Ingress
metadata:
  name: argocd-server
  labels:
    helm.sh/chart: argo-cd-4.9.7
    app.kubernetes.io/name: argocd-server
    app.kubernetes.io/instance: release-name
    app.kubernetes.io/component: server
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: argocd
spec:
  rules:
    - host: MY_URL
      http:
        paths:
          - path: /argo-cd
            pathType: Prefix
            backend:
              service:
                name: argocd-server
                port:
                  number: 80
  tls:
    - hosts:
      - MY_URL

The RBACs and ClusterRole are right (thanks to the amazing folks at the helm repo it was all automatically done).

Do I need to add some magic annotation to the Ingress here? We use Nginx as ingress controller so I'm happy even with nginx-only annotations.

12345ieee commented 1 year ago

Sorry for double posting:

Considering that the wss request is sent to: wss://MY_URL/terminal?pod=argocd-server... instead of wss://MY_URL/argo-cd/terminal?pod=argocd-server...

I assume the wss functionality is not respecting the custom root path. Where should I configure it?