nebari-dev / nebari

🪴 Nebari - your open source data science platform
https://nebari.dev
BSD 3-Clause "New" or "Revised" License

[BUG] - module.kubernetes-keycloak-helm.helm_release.keycloak timed out waiting for the condition on local development #2076

Open chrislevn opened 11 months ago

chrislevn commented 11 months ago

Describe the bug

I'm trying to test-deploy Nebari locally (running on Kubernetes with the docker-desktop context). The Keycloak Helm release fails with a timeout after 5m0s.

[terraform]: ╷
[terraform]: │ Error: timed out waiting for the condition
[terraform]: │ 
[terraform]: │   with module.kubernetes-keycloak-helm.helm_release.keycloak,
[terraform]: │   on modules/kubernetes/keycloak-helm/main.tf line 1, in resource "helm_release" "keycloak":
[terraform]: │    1: resource "helm_release" "keycloak" {
[terraform]: │ 
[terraform]: ╵
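
For what it's worth, if the Keycloak pod simply needs more time to pull images and start on a local cluster, one workaround (a sketch, not Nebari's actual module code) would be raising the Helm provider's release timeout, which defaults to 300 seconds. The helm_release resource in the Terraform Helm provider accepts a timeout argument in seconds:

```hcl
# Hypothetical override sketch: Nebari's keycloak-helm module would need
# to expose this setting; the chart path below is a placeholder.
resource "helm_release" "keycloak" {
  name    = "keycloak"
  chart   = "./charts/keycloak" # placeholder path
  wait    = true
  timeout = 600 # wait up to 10 minutes instead of the 5m0s default
}
```

This doesn't fix a pod that is genuinely failing to start, but it rules out slow image pulls as the cause of the timeout.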

Result of kubectl get pods -n <namespace>:

NAME                                      READY   STATUS    RESTARTS   AGE
nebari-traefik-ingress-7dbfdc5dd5-5hfkq   1/1     Running   0          18h

Result of kubectl logs nebari-traefik-ingress-7dbfdc5dd5-5hfkq -n <namespace>:

time="2023-10-17T20:38:15Z" level=info msg="Configuration loaded from flags."
time="2023-10-17T20:38:15Z" level=error msg="Secret <same with namespace>/ does not exist" namespace=<namespace> providerName=kubernetescrd TLSStore=default

Expected behavior

Pass all the deployment stages with an accessible URL/IP Address at the end

OS and architecture in which you are running Nebari

macOS (Apple M1)

How to Reproduce the problem?

nebari-config.yaml

project_name: <project_name>
provider: existing
domain: <domain>
certificate:
  type: self-signed
security:
  authentication:
    type: GitHub
    config:
      client_id: <github_client_id>
      client_secret: <github_secret>
  keycloak:
    initial_root_password: <initial_root_password>
    overrides:
      image:
        repository: quansight/nebari-keycloak
default_images:
  jupyterhub: quay.io/nebari/nebari-jupyterhub:2023.7.2
  jupyterlab: quay.io/nebari/nebari-jupyterlab:2023.7.2
  dask_worker: quay.io/nebari/nebari-dask-worker:2023.7.2
storage:
  conda_store: 200Gi
  shared_filesystem: 200Gi
theme:
  jupyterhub:
    hub_title: Nebari - <project_name>
    hub_subtitle: Your open source data science platform, hosted
    welcome: Welcome! Learn about Nebari's features and configurations in <a href="https://www.nebari.dev/docs">the
      documentation</a>. If you have any questions or feedback, reach the team on
      <a href="https://www.nebari.dev/docs/community#getting-support">Nebari's support
      forums</a>.
    logo: https://raw.githubusercontent.com/nebari-dev/nebari-design/main/logo-mark/horizontal/Nebari-Logo-Horizontal-Lockup-White-text.svg
    display_version: true
helm_extensions: []
monitoring:
  enabled: true
argo_workflows:
  enabled: true
kbatch:
  enabled: true
cdsdashboards:
  enabled: true
  cds_hide_user_named_servers: true
  cds_hide_user_dashboard_servers: false
terraform_state:
  type: local
namespace: <namespace>
nebari_version: 2023.7.2
existing:
  kube_context: docker-desktop
  node_selectors:
    general:
      key: kubernetes.io/os
      value: linux
    user:
      key: kubernetes.io/os
      value: linux
    worker:
      key: kubernetes.io/os
      value: linux
profiles:
  jupyterlab:
  - display_name: Small Instance
    description: Stable environment with 2 cpu / 8 GB ram
    default: true
    kubespawner_override:
      cpu_limit: 2
      cpu_guarantee: 1.5
      mem_limit: 8G
      mem_guarantee: 5G
  - display_name: Medium Instance
    description: Stable environment with 4 cpu / 16 GB ram
    kubespawner_override:
      cpu_limit: 4
      cpu_guarantee: 3
      mem_limit: 16G
      mem_guarantee: 10G
  dask_worker:
    Small Worker:
      worker_cores_limit: 2
      worker_cores: 1.5
      worker_memory_limit: 8G
      worker_memory: 5G
      worker_threads: 2
    Medium Worker:
      worker_cores_limit: 4
      worker_cores: 3
      worker_memory_limit: 16G
      worker_memory: 10G
      worker_threads: 4
environments:
  environment-dask.yaml:
    name: dask
    channels:
    - conda-forge
    dependencies:
    - python=3.10.8
    - ipykernel=6.21.0
    - ipywidgets==7.7.1
    - nebari-dask==2023.7.2
    - python-graphviz=0.20.1
    - pyarrow=10.0.1
    - s3fs=2023.1.0
    - gcsfs=2023.1.0
    - numpy=1.23.5
    - numba=0.56.4
    - pandas=1.5.3
    - pip:
      - kbatch==0.4.1
  environment-dashboard.yaml:
    name: dashboard
    channels:
    - conda-forge
    dependencies:
    - python=3.10
    - cdsdashboards-singleuser=0.6.3
    - cufflinks-py=0.17.3
    - dash=2.8.1
    - geopandas=0.12.2
    - geopy=2.3.0
    - geoviews=1.9.6
    - gunicorn=20.1.0
    - holoviews=1.15.4
    - ipykernel=6.21.2
    - ipywidgets=8.0.4
    - jupyter=1.0.0
    - jupyterlab=3.6.1
    - jupyter_bokeh=3.0.5
    - matplotlib=3.7.0
    - nebari-dask==2023.7.2
    - nodejs=18.12.1
    - numpy
    - openpyxl=3.1.1
    - pandas=1.5.3
    - panel=0.14.3
    - param=1.12.3
    - plotly=5.13.0
    - python-graphviz=0.20.1
    - rich=13.3.1
    - streamlit=1.9.0
    - sympy=1.11.1
    - voila=0.4.0
    - pip=23.0
    - pip:
      - streamlit-image-comparison==0.0.3
      - noaa-coops==0.2.1
      - dash_core_components==2.0.0
      - dash_html_components==2.0.0

I ran this with nebari deploy -c nebari-config.yaml --dns-auto-provision

Command output

No response

Versions and dependencies used.

kubectl version

Client Version: v1.28.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.3

nebari --version

2023.7.2

Compute environment

kind

Integrations

Keycloak

Anything else?

This is related to issue https://github.com/nebari-dev/nebari/issues/1491; however, I am still getting the same error even with the latest update.

iameskild commented 11 months ago

Hi @chrislevn, sorry to hear you're running into trouble with the local deployment. Local deployments have been a little tricky; currently they only work on systems running Linux (see https://github.com/nebari-dev/nebari/issues/1405). Linux is where folks, myself included, have been able to deploy when working from a Mac.

These docs go into more detail if you want to try again on Linux.

chrislevn commented 11 months ago

@iameskild Does this also mean for:

  1. Running deploy locally on Mac M1 but as existing Kubernetes infrastructures?
  2. Or run nebari deploy through Docker Linux Image from Mac?
iameskild commented 11 months ago

@chrislevn deploying Nebari to one of the cloud providers is supported on Mac (M1 and Intel). The issue is with deploying Nebari using Kind locally.

chrislevn commented 11 months ago

Hi @iameskild, thank you for your response. I've moved on to testing with the cloud now.

I just found out that the latest version of Nebari (2023.10.1) has an --exclude-stage option, and I would like to exclude the keycloak stage for now [1].

After upgrading, I can't deploy Nebari at all, even after regenerating a new config file. The error was:

ValidationError: 1 validation error for ExistingInputVars
kube_context
  none is not an allowed value (type=type_error.none.not_allowed)
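
That error format is pydantic v1's way of saying the model received None for kube_context, i.e. the value never made it from the YAML into the input-vars model. A minimal plain-Python illustration of that failure mode (the function and class names here are hypothetical, not Nebari's actual code):

```python
# Minimal sketch of the pydantic-v1-style "none is not an allowed value"
# check, in plain Python (names here are hypothetical, not Nebari's code).


class ValidationError(ValueError):
    pass


def validate_existing_input_vars(config: dict) -> str:
    """Mimic ExistingInputVars validation: kube_context must not be None."""
    existing = config.get("existing") or {}
    kube_context = existing.get("kube_context")
    if kube_context is None:
        # pydantic v1 reports this as:
        #   none is not an allowed value (type=type_error.none.not_allowed)
        raise ValidationError("kube_context: none is not an allowed value")
    return kube_context


# The value only reaches the model if the "existing" block is actually
# read; if that block is dropped or looked up under the wrong key while
# rendering stages, kube_context arrives as None even though the YAML
# clearly sets it:
good = {"existing": {"kube_context": "docker-desktop"}}
bad = {}  # "existing" block never made it into the parsed config

print(validate_existing_input_vars(good))  # docker-desktop
try:
    validate_existing_input_vars(bad)
except ValidationError as e:
    print(e)  # kube_context: none is not an allowed value
```

So the question becomes why the "existing" block isn't reaching validation, not whether the YAML value is set.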

In my nebari-config.yaml, the config is:

existing:
  kube_context: vke-2e483d2a-3941-4906-97ab-d1f632ef983b
  node_selectors:
    general:
      key: vke.vultr.com/node-pool
      value: general
    user:
      key: vke.vultr.com/node-pool
      value: user
    worker:
      key: vke.vultr.com/node-pool
      value: worker

[2]

My questions are:

  1. How can I properly use the --exclude-stage flag to skip the keycloak stage?
  2. My kube_context is not none, so why is it showing as "none"? I couldn't find anything in the documentation (I think the documentation is for 2023.07.02). What can I do to resolve this?
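
On [1]: I can't speak for the CLI definitively, but --exclude-stage reportedly takes glob patterns matched against stage names. A small stdlib sketch of how such glob filtering would behave (the stage names below are made up for illustration, not Nebari's real stage list):

```python
# Sketch of glob-based stage exclusion using only the stdlib.
# Stage names are illustrative, not Nebari's actual stage list.
from fnmatch import fnmatch


def exclude_stages(stage_names, patterns):
    """Drop any stage whose name matches one of the glob patterns."""
    return [s for s in stage_names
            if not any(fnmatch(s, p) for p in patterns)]


stages = [
    "01-terraform-state",
    "02-infrastructure",
    "05-kubernetes-keycloak",
    "06-kubernetes-keycloak-configuration",
]

remaining = exclude_stages(stages, ["*keycloak*"])
print(remaining)  # ['01-terraform-state', '02-infrastructure']
```

If the flag works this way, a pattern like "*keycloak*" would skip every Keycloak-related stage in one go.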

Here is the whole config file I would like to deploy:

project_name: <project>
provider: existing
domain: <domain>
certificate:
  # type: self-signed
  # type: existing
  # secret_name: dev
  type: lets-encrypt
  acme_email: <acme_email>
  acme_server: https://acme-v02.api.letsencrypt.org/directory
security:
  authentication:
    type: password
  keycloak:
    initial_root_password: NgqSU5t2dTMzLKsh
default_images:
  jupyterhub: quay.io/nebari/nebari-jupyterhub:2023.10.1
  jupyterlab: quay.io/nebari/nebari-jupyterlab:2023.10.1
  dask_worker: quay.io/nebari/nebari-dask-worker:2023.10.1
storage:
  conda_store: 200Gi
  shared_filesystem: 200Gi
theme:
  jupyterhub:
    hub_title: Nebari - cqai
    hub_subtitle: Your open source data science platform, hosted
    welcome: Welcome! Learn about Nebari's features and configurations in <a href="https://www.nebari.dev/docs">the
      documentation</a>. If you have any questions or feedback, reach the team on
      <a href="https://www.nebari.dev/docs/community#getting-support">Nebari's support
      forums</a>.
    logo: https://raw.githubusercontent.com/nebari-dev/nebari-design/main/logo-mark/horizontal/Nebari-Logo-Horizontal-Lockup-White-text.svg
    display_version: true
helm_extensions: []
monitoring:
  enabled: true
argo_workflows:
  enabled: true
kbatch:
  enabled: true
terraform_state:
  type: local
namespace: dev
nebari_version: 2023.10.1
existing:
  kube_context: vke-2e483d2a-3941-4906-97ab-d1f632ef983b
  node_selectors:
    general:
      key: vke.vultr.com/node-pool
      value: general
    user:
      key: vke.vultr.com/node-pool
      value: user
    worker:
      key: vke.vultr.com/node-pool
      value: worker
profiles:
  jupyterlab:
  - display_name: Small Instance
    description: Stable environment with 2 cpu / 8 GB ram
    default: true
    kubespawner_override:
      cpu_limit: 2
      cpu_guarantee: 1.5
      mem_limit: 8G
      mem_guarantee: 5G
  - display_name: Medium Instance
    description: Stable environment with 4 cpu / 16 GB ram
    kubespawner_override:
      cpu_limit: 4
      cpu_guarantee: 3
      mem_limit: 16G
      mem_guarantee: 10G
  dask_worker:
    Small Worker:
      worker_cores_limit: 2
      worker_cores: 1.5
      worker_memory_limit: 8G
      worker_memory: 5G
      worker_threads: 2
    Medium Worker:
      worker_cores_limit: 4
      worker_cores: 3
      worker_memory_limit: 16G
      worker_memory: 10G
      worker_threads: 4
environments:
  environment-dask.yaml:
    name: dask
    channels:
    - conda-forge
    dependencies:
    - python=3.10.8
    - ipykernel=6.21.0
    - ipywidgets==7.7.1
    - nebari-dask==2023.7.2
    - python-graphviz=0.20.1
    - pyarrow=10.0.1
    - s3fs=2023.1.0
    - gcsfs=2023.1.0
    - numpy=1.23.5
    - numba=0.56.4
    - pandas=1.5.3
    - pip:
      - kbatch==0.4.1
  environment-dashboard.yaml:
    name: dashboard
    channels:
    - conda-forge
    dependencies:
    - python=3.10
    - cdsdashboards-singleuser=0.6.3
    - cufflinks-py=0.17.3
    - dash=2.8.1
    - geopandas=0.12.2
    - geopy=2.3.0
    - geoviews=1.9.6
    - gunicorn=20.1.0
    - holoviews=1.15.4
    - ipykernel=6.21.2
    - ipywidgets=8.0.4
    - jupyter=1.0.0
    - jupyterlab=3.6.1
    - jupyter_bokeh=3.0.5
    - matplotlib=3.7.0
    - nebari-dask==2023.7.2
    - nodejs=18.12.1
    - numpy
    - openpyxl=3.1.1
    - pandas=1.5.3
    - panel=0.14.3
    - param=1.12.3
    - plotly=5.13.0
    - python-graphviz=0.20.1
    - rich=13.3.1
    - streamlit=1.9.0
    - sympy=1.11.1
    - voila=0.4.0
    - pip=23.0
    - pip:
      - streamlit-image-comparison==0.0.3
      - noaa-coops==0.2.1
      - dash_core_components==2.0.0
      - dash_html_components==2.0.0
prevent_deploy: false