jupyterhub / zero-to-jupyterhub-k8s

Helm Chart & Documentation for deploying JupyterHub on Kubernetes
https://zero-to-jupyterhub.readthedocs.io

403 Forbidden XSRF cookie does not match POST argument after updating to the latest helm chart version (3.3.7) #3422

Open matanshk opened 6 months ago

matanshk commented 6 months ago

Bug description

We run the z2jh Helm chart on our Kubernetes cluster and upgraded it from 3.1.0 to the latest version (3.3.7). Once the upgrade finished, we started getting this error in the UI: "403 Forbidden, XSRF cookie does not match POST argument". The behavior is inconsistent: some people on the team always hit the error, some hit it only sometimes, and some never do. It happens only in Chrome and Firefox; Safari works fine. Clearing cookies and using incognito mode didn't help, and updating the browsers to their newest versions changed nothing.


We never saw this issue before the upgrade. I tried downgrading the Helm chart to earlier patch releases (3.3.6, 3.3.5, 3.3.4, 3.3.3) and still got the same 403 error. When I downgraded to 3.1.0 (the version we ran before the upgrade), the issue disappeared.

The hub logs show an XSRF id mismatch (see Logs below).

How to reproduce

Actually, we tried our best to work out how to reproduce the issue and to trigger it for team members who don't experience it, but without success :| What I can say is that it happens at the authentication step, and it doesn't matter whether you provide the correct username and password or wrong ones: you get the 403 error either way.

Expected behaviour

To get a smooth authentication process without getting the 403 Forbidden error

Actual behaviour

We are getting 403 error right after clicking on the "Sign in" button

Your personal set up

We are running on an LKE cluster with Debian 11 worker nodes. We use the NGINX ingress controller with mTLS certificate authentication on the ingress (I disabled mTLS for testing and nothing changed), together with the dummy authenticator and a preconfigured password. The issue appeared right after upgrading the Helm chart from 3.1.0 to 3.3.7.

Configuration

```
singleuser:
  events: false
  networkPolicy:
    enabled: false
  storage:
    type: dynamic
    extraLabels: {}
    extraVolumes:
      - name: sparkmagic-config
        configMap:
          name: sparkmagic-config
    extraVolumeMounts:
      - name: sparkmagic-config
        mountPath: /opt/.sparkmagic/config.json
        subPath: config.json
    static:
      pvcName:
      subPath: "{username}"
    capacity: 10Gi
    homeMountPath: /home/jovyan
    dynamic:
      storageClass:
      pvcNameTemplate: claim-{username}{servername}
      volumeNameTemplate: volume-{username}{servername}
      storageAccessModes: [ReadWriteOnce]
  extraEnv:
    SPARKMAGIC_CONF_DIR: /opt/.sparkmagic/
    SPARKMAGIC_CONF_FILE: config.json
  image:
    name:
    tag:
    pullPolicy: Always
    pullSecrets: [ "acr-docker-auth" ]
  startTimeout: 300
  cmd: "/opt/entrypoint.sh"
proxy:
  service:
    type: ClusterIP
  chp:
    networkPolicy:
      enabled: false
hub:
  existingSecret: jupyterhub-secret-conf
  networkPolicy:
    enabled: false
  config:
    Authenticator:
      admin_users:
        - user1
        - user2
        - user3
      allowed_users:
        - user4
        - user5
    JupyterHub:
      authenticator_class: dummy
  authenticatePrometheus: true
  extraEnv:
    - name: PROMETHEUS_TOKEN
      valueFrom:
        secretKeyRef:
          name: prometheus-service-token
          key: PROMETHEUS_TOKEN
  extraConfig:
    prometheus-service.py: |
      # Add a service "prometheus-service" to scrape prometheus metrics
      c.JupyterHub.services = [
          {
              "name": "prometheus-service",
              "api_token": os.environ["PROMETHEUS_TOKEN"]
          },
      ]
      # Add a service role to scrape prometheus metrics
      c.JupyterHub.load_roles = [
          {
              "name": "service-metrics-role",
              "description": "access metrics",
              "scopes": [
                  "read:metrics",
              ],
              "services": [
                  "prometheus-service",
              ],
          }
      ]
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: letsencrypt-production
    nginx.ingress.kubernetes.io/auth-tls-error-page: "http://www.mysite.com/error-cert.html"
    nginx.ingress.kubernetes.io/auth-tls-pass-certificate-to-upstream: "true"
    nginx.ingress.kubernetes.io/auth-tls-secret: "jupyterhub/ca-secret"
    nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
    nginx.ingress.kubernetes.io/auth-tls-verify-depth: "2"
  ingressClassName:
  pathSuffix:
  pathType: Prefix
  hosts:
    - jupyterhub.example.host.net
  tls:
    - hosts:
        - jupyterhub.example.host.net
      secretName: jupyterhub-production-tls
```

Logs

```
[D 2024-05-22 10:37:37.991 JupyterHub _xsrf_utils:155] xsrf id mismatch b'None:K_exHeY0CyJABPsBIDe7n6UIv1_upqmXywnhbOr9FIQ=' != b'None:TC8vH45MqUauWHsXz0zEsrVDFQ-Hzg0Zv3mZzYFnjls='
[I 2024-05-22 10:37:37.992 JupyterHub _xsrf_utils:125] Setting new xsrf cookie for b'None:TC8vH45MqUauWHsXz0zEsrVDFQ-Hzg0Zv3mZzYFnjls=' {'path': '/hub/', 'max_age': 3600}
[W 2024-05-22 10:37:37.992 JupyterHub web:1873] 403 POST /hub/login?next=%2Fhub%2F (10.2.13.129): XSRF cookie does not match POST argument
```
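
The "xsrf id mismatch" line in the logs compares two ids that each look like URL-safe base64 of a 32-byte digest, prefixed with `None:` (no logged-in user yet). One way to read this, offered here only as an assumption and not as JupyterHub's confirmed implementation, is that the token for an anonymous login page is bound to a hash of request properties such as the apparent client IP and the User-Agent. The sketch below models that idea with hypothetical helper names (`anonymous_xsrf_id`, `xsrf_check`) and shows how a changed apparent IP between the GET and the POST of the login form would produce a 403.

```python
import base64
import hashlib


def anonymous_xsrf_id(remote_ip: str, user_agent: str) -> str:
    """Hypothetical id for an anonymous request: hash of apparent IP + User-Agent.

    This is an illustrative model, not code taken from JupyterHub.
    """
    hasher = hashlib.sha256()
    hasher.update(remote_ip.encode("ascii"))
    hasher.update(user_agent.encode("utf8", "replace"))
    return base64.urlsafe_b64encode(hasher.digest()).decode("ascii")


def xsrf_check(id_at_get: str, id_at_post: str) -> int:
    """Return the status a login POST would get in this simplified model."""
    return 200 if id_at_get == id_at_post else 403


ua = "Mozilla/5.0 (example)"  # made-up User-Agent

# GET and POST both appear to come from the same address: the ids match.
stable = xsrf_check(
    anonymous_xsrf_id("10.2.13.129", ua),
    anonymous_xsrf_id("10.2.13.129", ua),
)

# The POST arrives through a different hop, so the apparent address differs.
shifted = xsrf_check(
    anonymous_xsrf_id("10.2.13.129", ua),
    anonymous_xsrf_id("10.2.7.44", ua),  # invented second address
)

print(stable, shifted)
```

Under this assumption, anything that makes the hub see a different client address from one request to the next (several proxy replicas, a CDN in front of the ingress, a load balancer without session affinity) would explain why only some users are affected and why the failure is intermittent.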

samyuh commented 6 months ago

Hello! I'm also having this problem. I have a custom LoadBalancer service pointing to the proxy, which is defined as ClusterIP:

proxy:
  service:
    type: ClusterIP

I tried to disable the XSRF check, but without success:

extraConfig:
    myConfigName: |
      c.ServerApp.disable_check_xsrf = True
      c.JupyterHub.disable_check_xsrf = True
      print("Disabled XSRF check", flush=True)

Am I doing something wrong to disable this xsrf check?

matanshk commented 6 months ago

@samyuh, if I'm not mistaken, the option to configure the XSRF cookie was removed when JupyterHub 4.0.0 was released.

samyuh commented 6 months ago

Oh, thanks for the information.

By the way, I double-checked and we are in fact using version 3.3.7. I will try to downgrade later today and report back on whether the bug persists.

samyuh commented 6 months ago

@matanshk after the downgrade to 3.1.0 we are able to log in.

matanshk commented 6 months ago

@samyuh I'm happy to hear that; this is exactly what happened to us.

jdicesar commented 5 months ago

Hey guys I have been fighting this on my server build as well. The downgrade to 3.1.0 also worked for me. I am using Docker Swarm as opposed to Kubernetes. Did you guys have any issues getting the singleuser servers to run after the downgrade? I matched the version for that image to jupyterhub/singleuser:3.1.0.

matanshk commented 5 months ago

@jdicesar We had no issues with the single-user server after the downgrade. I should mention that we build our single-user server image on top of the Jupyter base image.

Khoi16 commented 5 months ago

Has anyone fixed this bug?

Khoi16 commented 5 months ago

I finally found that the error occurs when the domain's DNS goes through a proxy server such as Cloudflare. Can anyone explain why? Thanks.

Khoi16 commented 5 months ago

Oh, I found the solution: add proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; to your z2jh ingress configuration:

ingress:
  enabled: true
  # annotations: {}
  annotations:
    nginx.org/websocket-services: proxy-public
    nginx.org/server-snippets: |
      server_name asdasdasdasdsaa_test;
      location / {
        proxy_pass http://localhost:8888;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection upgrade;
        proxy_set_header Accept-Encoding gzip;
      }

Khoi16 commented 5 months ago

So, have you solved this problem? I think it comes from a proxy between your ingress and your domain. You can check by using a local domain (set the domain manually in your hosts file).

Another thing to try: set the full configuration as in my comment (Host header, proxy_pass, …). You can set server_name to your domain.

On Thursday, June 20, 2024, matanshk @.***> wrote:

@Khoi16 https://github.com/Khoi16 I added the annotation, but it didn't solve the issue. Can you please help me understand what's wrong here?

ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: letsencrypt-production
    nginx.ingress.kubernetes.io/auth-tls-error-page: "http://www.mysite.com/error-cert.html"
    nginx.ingress.kubernetes.io/auth-tls-pass-certificate-to-upstream: "true"
    nginx.ingress.kubernetes.io/auth-tls-secret: "jupyterhub/ca-secret"
    nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
    nginx.ingress.kubernetes.io/auth-tls-verify-depth: "2"
    nginx.ingress.kubernetes.io/server-snippet: |
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;


chainlink commented 3 months ago

Also hitting this issue in our jupyterhub install.

ScOut3R commented 2 months ago

I encountered this issue running JupyterHub on GKE with the Gateway API exposing the web UI. For version 4.x, the solution was to set X-Forwarded-Host on the HTTPRoute to the public-facing host.

samyuh commented 2 months ago

I will try to work on this once I have some free time. Has anyone been able to reproduce this locally?

I could only reproduce it on our preprod servers, and I don't want to debug there.

derekelewis commented 2 months ago

I encountered this issue when using Z2JH on EKS and an ALB as the ingress. Enabling sticky sessions fixed it for me.

Richard-Regan commented 1 month ago

Also hitting this issue in our jupyterhub install.

Hi chainlink, did you ever solve this? I am getting the same issue on a clean install.

matanshk commented 2 weeks ago

Any update on this issue?

matanshk commented 1 week ago

Hi everyone, the issue was resolved by adding this annotation to the ingress:

    nginx.ingress.kubernetes.io/configuration-snippet: |
      proxy_set_header X-Real-IP $remote_addr;

jrdnbradford commented 18 hours ago

@matanshk what's your full set of ingress annotations? I added this and still got the error.
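
Several of the fixes reported in this thread (X-Forwarded-For, X-Real-IP, sticky sessions) influence which client address the hub ends up seeing. The sketch below is a hypothetical, simplified model of how a backend might pick that address from proxy headers. The function `apparent_client_ip`, its precedence rules, and all addresses are illustrative assumptions, not the actual logic of JupyterHub, its proxy, or nginx. It only shows why the address can differ between two requests that pass through different proxy replicas, and why pinning one header can make it stable.

```python
from typing import Mapping


def apparent_client_ip(headers: Mapping[str, str], peer_ip: str) -> str:
    """Hypothetical rule: X-Real-IP wins, then the last X-Forwarded-For hop,
    then the address of the directly connected peer."""
    real_ip = headers.get("X-Real-IP", "").strip()
    if real_ip:
        return real_ip
    forwarded = headers.get("X-Forwarded-For", "")
    hops = [hop.strip() for hop in forwarded.split(",") if hop.strip()]
    return hops[-1] if hops else peer_ip


# Two requests from the same browser that pass through different ingress
# replicas (all addresses are invented for illustration).
via_replica_a = {"X-Forwarded-For": "203.0.113.7, 10.2.13.129"}
via_replica_b = {"X-Forwarded-For": "203.0.113.7, 10.2.7.44"}
unstable = (
    apparent_client_ip(via_replica_a, "10.0.0.5")
    != apparent_client_ip(via_replica_b, "10.0.0.5")
)

# The same requests, but the ingress also sets X-Real-IP to the address it saw.
pinned_a = {**via_replica_a, "X-Real-IP": "203.0.113.7"}
pinned_b = {**via_replica_b, "X-Real-IP": "203.0.113.7"}
stable = (
    apparent_client_ip(pinned_a, "10.0.0.5")
    == apparent_client_ip(pinned_b, "10.0.0.5")
)

print(unstable, stable)
```

Whether a given deployment behaves like this depends on every hop between the browser and the hub, which may be why the same annotation resolved the error for one setup and not for another.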