opendatahub-io / odh-dashboard

Dashboard for ODH
Apache License 2.0

[Bug]: Secure routes cause Liveness and Readiness probes to timeout the dashboard pod #544

Closed maroroman closed 2 years ago

maroroman commented 2 years ago

Current Behavior

The Liveness/Readiness probes call the api/config endpoint to determine whether the dashboard pod is ready. However, the new secure routes require a username, which the probes do not have, so the probes time out and the pod crashes.

Expected Behavior

The Liveness/Readiness probes should reach the dashboard successfully, or use a different method to check pod health.

Steps To Reproduce

  1. Deploy the latest dashboard implementation with a deployment containing probes.
  2. Verify that the dashboard is crashlooping.
  3. Remove the probes.
  4. Verify that the dashboard is working.

Workaround (if any)

Delete readiness/liveness probes.

OpenShift Infrastructure Version

No response

OpenShift Version

No response

What browsers are you seeing the problem on?

No response

Open Data Hub Version

apiVersion: kfdef.apps.kubeflow.org/v1
kind: KfDef
metadata:
  annotations:
    kfctl.kubeflow.io/force-delete: "false"
  name: odh-dashboard-kfnbc-test
  namespace: opendatahub
spec:
  applications:
  - kustomizeConfig:
    # We just need this for creating the ODH default notebook images that are provided by ODH JupyterHub
    #TODO: Replace with the new ODH default notebook images as part of ODH Core
      overlays:
      - additional
      repoRef:
        name: manifests
        path: jupyterhub/notebook-images
    name: notebook-images
  - kustomizeConfig:
      repoRef:
        name: manifests
        path: odh-notebook-controller
    name: odh-notebook-controller
  - kustomizeConfig:
      overlays:
      - authentication
      # Uncomment the odhdashboard overlay below to have the operator deploy the configs
      # This should be removed from the deployed kfdef immediately after the operator deploys the configs
      # This will prevent the operator from reconciling the configs when they have been modified externally
      #- odhdashboardconfig
      repoRef:
        name: manifests-dashboard  # Use the odh-dashboard repo as the source for this kustomize manifest
        path: manifests
    name: odh-dashboard
  repos:
  - name: manifests
    uri: https://github.com/opendatahub-io/odh-manifests/tarball/master
  - name: manifests-dashboard  # Use the manifests from the odh-dashboard repo
    uri: https://github.com/opendatahub-io/odh-dashboard/tarball/main

Relevant log output

{"level":50,"time":1663063141964,"pid":77,"hostname":"odh-dashboard-77f9dd54b-kpc4l","msg":"Error retrieving username: Error getting Oauth Info for user, missing x-forwarded-access-token header

kpouget commented 2 years ago

Was this PR included in the version you tested? https://github.com/opendatahub-io/odh-dashboard/pull/543#event-7371299074

I don't think the actual version of the dashboard is part of what you included under Open Data Hub Version. Did I miss something?

I think the best way to tell the dashboard version when deploying from main is to include the image SHA, then go to https://quay.io/repository/opendatahub/odh-dashboard?tab=tags&tag=main and use the first characters of the SHA to look up the corresponding main-$COMMIT tag.

maroroman commented 2 years ago

Oh! OK, so it is fixed, but for some reason the make undeploy/deploy script was setting the probes to the old endpoint rather than the new health endpoint.

Will close this.
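
For anyone hitting the same crashloop, a minimal sketch of probes that target an unauthenticated health endpoint instead of api/config might look like the fragment below. The path `/api/health`, the port `8080`, and all timing values are assumptions for illustration and are not taken from this thread; check the deployment manifest in the dashboard repo for the actual values.

```yaml
# Hypothetical probe settings for the dashboard container.
# Path, port, and timings are assumptions, not values confirmed in this issue.
livenessProbe:
  httpGet:
    path: /api/health      # assumed health endpoint that needs no user token
    port: 8080             # assumed container port
    scheme: HTTP
  initialDelaySeconds: 30
  periodSeconds: 30
  timeoutSeconds: 15
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /api/health      # same assumed endpoint as the liveness probe
    port: 8080
    scheme: HTTP
  initialDelaySeconds: 30
  periodSeconds: 30
  timeoutSeconds: 15
  failureThreshold: 3
```

The point of the sketch is that the probe request carries no x-forwarded-access-token header, so it has to hit a route that does not look up a username.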

kpouget commented 2 years ago

Do you still have access to the dashboard image you used? It would be nice to have the SHA to confirm which commit was used.

maroroman commented 2 years ago

I did not use a specific commit; I built an image from the latest upstream main, which should be this: ca3d9a7d2814f87f073bcde413db0e930f906010

Edit: Actually I do have the image on quay: quay.io/mroman_redhat/odh-dashboard:upstream-1