error updating pangeo.pydata.org

rabernat commented 6 years ago

I just added a new notebook docker image in #20.

Now I am trying to deploy it.

$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Skip local chart repository
...Successfully got an update from the "pangeo" chart repository
...Successfully got an update from the "stable" chart repository
Update Complete. ⎈ Happy Helming!⎈ 
$ helm upgrade jupyter pangeo/pangeo -f jupyter-config.yaml -f secret-config.yaml --version 0.1.0-c1651d3
2018/05/01 11:25:54 warning: cannot overwrite table with non table for extraConfig (map[])
2018/05/01 11:25:54 warning: cannot overwrite table with non table for extraConfig (map[])
Error: UPGRADE FAILED: Deployment.apps "proxy" is invalid: spec.template.metadata.labels:
Invalid value: map[string]string{"heritage":"Tiller", "hub.jupyter.org/network-access-hub":"true", "hub.jupyter.org/network-access-singleuser":"true", "name":"proxy", "release":"jupyter", "component":"proxy"}: `selector` does not match template `labels`

No idea what to make of this error.

jacobtomlinson commented 6 years ago

Hmm this is odd. I think it is related to jupyterhub/zero-to-jupyterhub-k8s#653.

Maybe @yuvipanda has some thoughts?

rabernat commented 6 years ago

I can google my way to relevant-sounding issues https://github.com/kubernetes/helm/issues/1844 https://github.com/kubernetes/helm/issues/2437

But I don't know if they are actually relevant.

$ helm version
Client: &version.Version{SemVer:"v2.9.0", GitCommit:"f6025bb9ee7daf9fee0026541c90a6f557a3e0bc", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.0", GitCommit:"f6025bb9ee7daf9fee0026541c90a6f557a3e0bc", GitTreeState:"clean"}
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.1", GitCommit:"f38e43b221d08850172a9a4ea785a86a3ffa3b3a", GitTreeState:"clean", BuildDate:"2017-10-11T23:27:35Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"9+", GitVersion:"v1.9.3-gke.0", GitCommit:"a7b719f7d3463eb5431cf8a3caf5d485827b4210", GitTreeState:"clean", BuildDate:"2018-02-16T18:26:01Z", GoVersion:"go1.9.2b4", Compiler:"gc", Platform:"linux/amd64"}

rabernat commented 6 years ago

My crash course in helm / kubernetes has left me with the strong impression that this whole thing is a house of cards. Everything I try gives me a cryptic error. How can we make this process work better for cloud-mortals like myself?

rabernat commented 6 years ago

Does anyone have any idea how to move forward from this error? Delete the deployment and re-install?

mrocklin commented 6 years ago

I had an odd experience using newer versions of kubernetes on GCP recently. Maybe downgrade to the default version?

mrocklin commented 6 years ago

Alternatively, maybe we need to become more accustomed to asking for help from upstream in the Helm project itself. Perhaps someone should raise a github issue?

mrocklin commented 6 years ago

OK, I've moved one step forward (I think) with https://github.com/pangeo-data/helm-chart/pull/22 but am now running into another issue:

mrocklin@carbon:~/workspace/pangeo/gce$ helm upgrade jupyter ~/workspace/helm-chart/pangeo/ -f jupyter-config.yaml -f copy-secret-config.yaml
Error: UPGRADE FAILED: no ConfigMap with the name "nginx-proxy-config" found

Perhaps someone here has thoughts

tjcrone commented 6 years ago

what happens if you try installing the deployed chart as described here: https://github.com/pangeo-data/helm-chart?

mrocklin commented 6 years ago

That would not include my change in https://github.com/pangeo-data/helm-chart/pull/22 so instead I'm deploying the chart from a local version that has been modified

On Wed, May 2, 2018 at 10:16 AM, Tim Crone notifications@github.com wrote:

what happens if you try installing the deployed chart as described here: https://github.com/pangeo-data/helm-chart?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/pangeo-data/helm-chart/issues/21#issuecomment-385992904, or mute the thread https://github.com/notifications/unsubscribe-auth/AASszA_5e5mztB1ViGJCoV7I5ez0qpXJks5tub-rgaJpZM4TuFLn .

tjcrone commented 6 years ago

I see that there is no autohttps template in that version of jupyterhub. Strange. What happens if you try v0.7-a5c532d?

mrocklin commented 6 years ago

Unfortunately that version still defines extraConfig as a mapping

$ helm inspect jupyterhub/jupyterhub --version v0.7-a5c532d | grep extraConfig:
extraConfig: {}

rabernat commented 6 years ago

I feel like some input from the jupyterhub team could potentially save us hours of trial and error here. Should we consider reaching out more directly?

mrocklin commented 6 years ago

Sure. @yuvipanda is already pinged. Lets expand to @choldgraf

tjcrone commented 6 years ago

I managed to deploy the pangeo helm chart on a fresh gce cluster using v0.7-fd73c61, the current config file in the notebook-image branch (changed the loadbalancer ip), and a stripped-down secrets file that does include https, and was unable to replicate the ConfigMap error.

secret-config.yaml:

jupyterhub:
  proxy:
    secretToken: "074fd8fb655f412996012342d2014a278e49rts0f05bcca0396f0122752597327"
    https:
      enabled: true
      hosts:
        - "test.org"
      type: letsencrypt
      letsencrypt:
        contactEmail: "tjcrone@gmail.com"

tjcrone commented 6 years ago

quick side question, did you reserve your load balancer external ip address on GCP?

rabernat commented 6 years ago

What I did yesterday was to upgrade the existing ephemeral IP to a static IP. So now I believe it will persist.

Sent from my iPhone

On May 2, 2018, at 12:23 PM, Tim Crone notifications@github.com wrote:

quick side question, did you reserve your load balancer external ip address on GCP?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

jgerardsimcock commented 6 years ago

I have not tried this yet but it seems like this may be a way forward. https://github.com/kubernetes/helm/issues/3933

tjcrone commented 6 years ago

I was able to replicate these warnings by rolling forward to v0.7-82bed4a

2018/05/02 12:44:56 warning: cannot overwrite table with non table for extraConfig (map[])
2018/05/02 12:44:56 warning: cannot overwrite table with non table for extraConfig (map[])

I needed to do a helm dependency update in order for my changes to be recognized. Also when upgrading, I usually set the --force and --recreate-pods flags.

tjcrone commented 6 years ago

It would seem to me that helm is not registering the change in location for the nginx-configmap.yaml file, which gets moved to a subdirectory in more recent versions of the jupyterlab helm chart. Using the --debug flag during upgrade and grepping for nginx-config will show you where it is looking for the file. I'm thinking that the --reset-values and/or --recreate-pods and/or --force flags could help with this situation. I would try --reset-values first.

choldgraf commented 6 years ago

also pinging @minrk here, as he may have experience with this.

minrk commented 6 years ago

Yes, I have run into this. Try making extraConfig a dict instsead of a string:

hub:
  extraConfig:
    customPodHook: |
      from kubernetes import client
      ...

I think both a scalar string and a dict/map are supposed to work, but I've found using a string results in this warning. I think it's just a warning, not an error, though.

The error about not matching labels I think is due to recent label changes, which requires you to use --force to upgrade (which does delete & replace if patch fails). We recently dealt with this on Binder.

mrocklin commented 6 years ago

I've walked back to the version of JupyterHub that we were using before and changed extraConfig to be a map in https://github.com/pangeo-data/pangeo/pull/235

This appears to be functioning well. Thank you @minrk

pangeo-data / helm-chart

error updating pangeo.pydata.org #21