Closed TomAugspurger closed 4 years ago
The error message on the OOI failure is actually a bit different:
UPGRADE FAILED
Error: render error in "pangeo-deploy/charts/pangeo/charts/dask-gateway/templates/web-proxy-deployment.yaml": template: pangeo-deploy/charts/pangeo/charts/dask-gateway/templates/web-proxy-deployment.yaml:23:28: executing "pangeo-deploy/charts/pangeo/charts/dask-gateway/templates/web-proxy-deployment.yaml" at <include (print .Template.BasePath "/secret.yaml") .>: error calling include: template: pangeo-deploy/charts/pangeo/charts/dask-gateway/templates/secret.yaml:9:19: executing "pangeo-deploy/charts/pangeo/charts/dask-gateway/templates/secret.yaml" at <required "gateway.proxyToken must be a 32 byte random string" .Values.gateway.proxyToken>: error calling required: gateway.proxyToken must be a 32 byte random string
Error: UPGRADE FAILED: render error in "pangeo-deploy/charts/pangeo/charts/dask-gateway/templates/web-proxy-deployment.yaml": template: pangeo-deploy/charts/pangeo/charts/dask-gateway/templates/web-proxy-deployment.yaml:23:28: executing "pangeo-deploy/charts/pangeo/charts/dask-gateway/templates/web-proxy-deployment.yaml" at <include (print .Template.BasePath "/secret.yaml") .>: error calling include: template: pangeo-deploy/charts/pangeo/charts/dask-gateway/templates/secret.yaml:9:19: executing "pangeo-deploy/charts/pangeo/charts/dask-gateway/templates/secret.yaml" at <required "gateway.proxyToken must be a 32 byte random string" .Values.gateway.proxyToken>: error calling required: gateway.proxyToken must be a 32 byte random string
Traceback (most recent call last):
  File "/home/circleci/repo/venv/bin/hubploy", line 11, in <module>
    sys.exit(main())
  File "/home/circleci/repo/venv/lib/python3.7/site-packages/hubploy/__main__.py", line 89, in main
    helm.deploy(args.deployment, args.chart, args.environment, args.namespace, args.set, args.version, args.timeout, args.force)
  File "/home/circleci/repo/venv/lib/python3.7/site-packages/hubploy/helm.py", line 126, in deploy
    force
  File "/home/circleci/repo/venv/lib/python3.7/site-packages/hubploy/helm.py", line 62, in helm_upgrade
    subprocess.check_call(cmd)
  File "/usr/local/lib/python3.7/subprocess.py", line 347, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['helm', 'upgrade', '--wait', '--install', '--namespace', 'ooi-prod', 'ooi-prod', 'pangeo-deploy', '--timeout', '1200', '-f', 'deployments/ooi/config/common.yaml', '-f', 'deployments/ooi/config/prod.yaml', '-f', 'deployments/ooi/secrets/prod.yaml', '--set', 'pangeo.jupyterhub.singleuser.image.tag=c987384', '--set', 'pangeo.jupyterhub.singleuser.image.name=ooicloud.azurecr.io/ooi-pangeo-io-notebook']' returned non-zero exit status 1.
{"errors":[{"message":"Permission denied, wrong credentials","field":null,"help":null}]}
Exited with code exit status 1
cc @tjcrone in case you already know what's failing there.
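For reference, the render error above means the chart found no value for gateway.proxyToken. A minimal sketch of generating a suitable value in Python follows. The helper name make_proxy_token is made up for illustration, and the pangeo / dask-gateway nesting in the printed YAML is an assumption inferred from the chart path in the error, so check it against the actual values files before using it.

```python
import secrets


def make_proxy_token() -> str:
    """Return 32 random bytes encoded as 64 hexadecimal characters."""
    return secrets.token_hex(32)


token = make_proxy_token()

# ASSUMPTION: the value nests under pangeo -> dask-gateway because the
# failing template lives in pangeo-deploy/charts/pangeo/charts/dask-gateway.
print("pangeo:")
print("  dask-gateway:")
print("    gateway:")
print(f'      proxyToken: "{token}"')
```

The printed fragment would presumably belong in the secrets file for the deployment (for OOI prod, deployments/ooi/secrets/prod.yaml from the helm command above), not in a file that is committed in plain text.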
This
Error: Deployment.apps "gateway-dev-staging-dask-gateway" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/component":"gateway", "app.kubernetes.io/instance":"dev-staging", "app.kubernetes.io/managed-by":"Tiller", "app.kubernetes.io/name":"dask-gateway"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable && Deployment.apps "scheduler-proxy-dev-staging-dask-gateway" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/component":"scheduler-proxy", "app.kubernetes.io/instance":"dev-staging", "app.kubernetes.io/managed-by":"Tiller", "app.kubernetes.io/name":"dask-gateway"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable && Deployment.apps "web-proxy-dev-staging-dask-gateway" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/component":"web-proxy", "app.kubernetes.io/instance":"dev-staging", "app.kubernetes.io/managed-by":"Tiller", "app.kubernetes.io/name":"dask-gateway"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
looks similar to https://github.com/dask/dask-gateway/issues/147. @jcrist do you know whether that should have been fixed? IIUC, we're already on 0.6.1 for staging. Perhaps it's a different issue.
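One way to narrow down a "field is immutable" error like this is to compare the selector labels on the live Deployment with the ones the chart now renders. Below is a rough sketch of that comparison. The rendered labels are copied from the error above; the live labels are purely hypothetical placeholders, since the thread does not show what the existing Deployment actually had.

```python
def changed_selector_labels(live: dict, rendered: dict) -> dict:
    """Map each differing label key to its (live, rendered) value pair."""
    keys = sorted(set(live) | set(rendered))
    return {
        key: (live.get(key), rendered.get(key))
        for key in keys
        if live.get(key) != rendered.get(key)
    }


# Taken from the error message for the "gateway" Deployment.
rendered = {
    "app.kubernetes.io/component": "gateway",
    "app.kubernetes.io/instance": "dev-staging",
    "app.kubernetes.io/managed-by": "Tiller",
    "app.kubernetes.io/name": "dask-gateway",
}

# HYPOTHETICAL: illustrative only, not the real selector of the old Deployment.
live = {
    "app.kubernetes.io/component": "gateway",
    "app.kubernetes.io/instance": "dev-staging",
    "app.kubernetes.io/name": "dask-gateway",
}

changed = changed_selector_labels(live, rendered)
for key, (old, new) in changed.items():
    print(f"{key}: live={old!r} rendered={new!r}")
```

Any key this reports would be enough for Kubernetes to reject the update, because spec.selector cannot be modified on an existing Deployment.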
So you're already running 0.6.1, and you're only updating values, so this isn't an upgrade of dask-gateway? If so, then this is a different issue, and an odd one. Can you do an upgrade with --dry-run --debug added, so that the rendered charts are output, and post them somewhere? None of the labels we use for matchLabels should change during an upgrade, so I'm curious what's happening here.
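A small sketch of what that request amounts to, in the same list form hubploy passes to subprocess in the traceback above. The helper with_dry_run is hypothetical, and the dev-staging namespace and release names are assumed by analogy with the ooi-prod command; the real command would also carry the -f and --set arguments for that deployment.

```python
def with_dry_run(cmd: list) -> list:
    """Return a copy of a helm command with --dry-run and --debug appended."""
    extra = [flag for flag in ("--dry-run", "--debug") if flag not in cmd]
    return list(cmd) + extra


# ASSUMPTION: namespace and release are both "dev-staging", mirroring the
# 'ooi-prod' / 'ooi-prod' pair in the failing prod command.
base = [
    "helm", "upgrade", "--install",
    "--namespace", "dev-staging",
    "dev-staging", "pangeo-deploy",
]

debug_cmd = with_dry_run(base)
print(" ".join(debug_cmd))
```

With --dry-run, helm renders the templates without applying them, and --debug makes it print the rendered manifests, which is the output being asked for here.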
Looking at the config, it seems we were still on 0.5 (scheduler-proxy: daskgateway/dask-gateway-server:0.5.0). Sorry about the confusion.
In that case, I'll manually remove the bad deployment and trigger a new one.
I've manually deleted the staging dask-gateway deployments through the Google Cloud Platform UI. https://github.com/pangeo-data/pangeo-cloud-federation/pull/501 should redeploy things when it's merged.
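The deletion here was done through the UI. For anyone who prefers the command line, a rough equivalent might look like the sketch below, which only builds the kubectl commands and does not run them. The Deployment names come from the error message earlier in the thread; the dev-staging namespace is an assumption and should be confirmed first.

```python
# Deployment names as they appear in the "field is immutable" error.
DEPLOYMENTS = [
    "gateway-dev-staging-dask-gateway",
    "scheduler-proxy-dev-staging-dask-gateway",
    "web-proxy-dev-staging-dask-gateway",
]


def delete_commands(names: list, namespace: str) -> list:
    """Build one 'kubectl delete deployment' command per Deployment name."""
    return [
        ["kubectl", "delete", "deployment", name, "--namespace", namespace]
        for name in names
    ]


# ASSUMPTION: the staging release lives in a namespace called "dev-staging".
commands = delete_commands(DEPLOYMENTS, "dev-staging")
for cmd in commands:
    print(" ".join(cmd))
```

Deleting the Deployments lets the next helm upgrade recreate them with the new selector, at the cost of a short gap in the gateway service on staging.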
I think we're all good here.
Thanks Tom. Does this mean #479 can also be closed?
dev, ocean, and hydro are failing on staging. OOI is failing on prod.
Staging: https://circleci.com/gh/pangeo-data/pangeo-cloud-federation/1010 Prod: https://circleci.com/gh/pangeo-data/pangeo-cloud-federation/1001
Looking into this a bit now.