Closed guillaumeeb closed 1 year ago
In the meantime, I put back the configuration with BasicAuth and a password. We can use dask-gateway, but without the dashboard and with weak authentication.
The new configuration is still a bit better: Client, Scheduler and Workers are using the same image (pangeo/pangeo-notebook:latest), so we don't need to install extra packages on the workers.
And I also reached the end of the notebook dask_introduction with this setup, but maybe it was already OK before.
Wow, thanks @consideRatio for chiming in so quickly! I'll try a bit for an hour and report back.
That was close, definitely an improvement, but there seems to be a problem during auth:
[I 2022-08-17 06:43:49.038 JupyterHub log:189] 302 GET /hub/api/authorizations/token/[secret] -> /jupyterhub/hub/hub/api/authorizations/token/[secret] (dask-gateway@10.244.1.2) 22.28ms
[E 2022-08-17 06:43:49.152 JupyterHub web:1219] Uncaught exception in write_error
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/tornado/web.py", line 1217, in send_error
    self.write_error(status_code, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/jupyterhub/handlers/base.py", line 1283, in write_error
    html = self.render_template('%s.html' % status_code, sync=True, **ns)
  File "/usr/local/lib/python3.8/dist-packages/jupyterhub/handlers/base.py", line 1202, in render_template
    return template.render(**template_ns)
  File "/usr/local/lib/python3.8/dist-packages/jinja2/environment.py", line 1304, in render
    self.environment.handle_exception()
  File "/usr/local/lib/python3.8/dist-packages/jinja2/environment.py", line 925, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "/usr/local/share/jupyterhub/templates/404.html", line 1, in top-level template code
    {% extends "error.html" %}
  File "/usr/local/share/jupyterhub/templates/error.html", line 1, in top-level template code
    {% extends "page.html" %}
  File "/usr/local/share/jupyterhub/templates/page.html", line 78, in top-level template code
    {% if not no_spawner_check and user and user.spawner.options_form %}
  File "/usr/local/lib/python3.8/dist-packages/jinja2/environment.py", line 474, in getattr
    return getattr(obj, attribute)
jinja2.exceptions.UndefinedError: 'jupyterhub.orm.Service object' has no attribute 'spawner'
[W 2022-08-17 06:43:49.154 JupyterHub log:189] 404 GET /jupyterhub/hub/hub/api/authorizations/token/[secret] (dask-gateway@10.244.1.2) 114.95ms
Is Dask-gateway having trouble verifying the token? I notice a doubled hub segment in the URL above; is that normal? I probably have to configure the JupyterHub API server URL via gateway.auth.jupyterhub.apiUrl?
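One possible reading of the log above, offered as an assumption rather than a confirmed diagnosis: the gateway calls the Hub API on a path without the /jupyterhub/ base URL, and the Hub redirects that unprefixed path under its own prefix, which produces the doubled hub. The sketch below is a simplified model of that behaviour, not JupyterHub's actual handler code; `url_path_join` and `redirect_unprefixed` are illustrative helpers written for this comment.

```python
def url_path_join(*parts: str) -> str:
    """Join URL path pieces with single slashes, keeping a leading slash."""
    stripped = [p.strip("/") for p in parts if p.strip("/")]
    return "/" + "/".join(stripped)


def redirect_unprefixed(request_path: str, hub_prefix: str) -> str:
    """Model a hub that moves any path outside its prefix to inside it."""
    if request_path.startswith(hub_prefix):
        return request_path  # already under the prefix, no redirect needed
    return url_path_join(hub_prefix, request_path)


# Assumed hub prefix: base URL /jupyterhub/ followed by hub/
hub_prefix = "/jupyterhub/hub/"

# API path that lacks the /jupyterhub/ base URL (what the log shows):
wrong = redirect_unprefixed("/hub/api/authorizations/token/TOKEN", hub_prefix)
# API path that already includes the base URL:
right = redirect_unprefixed("/jupyterhub/hub/api/authorizations/token/TOKEN", hub_prefix)

print(wrong)
print(right)
```

If this model holds, pointing the gateway at an API URL that already contains the base URL should avoid the redirect entirely.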
Before that, the dask-gateway request reaches the api-gateway-server, that's good!
I'll revert back for now as I have to go.
Hi,
This is the configuration I tested, skipping TLS and Check-In configuration for the time being:
dask-gateway:
  enabled: true
  gateway:
    auth:
      jupyterhub:
        apiToken: token1
      type: jupyterhub
    prefix: /services/dask-gateway
    backend:
      worker:
        cores:
          limit: 2
        memory:
          limit: 8G
        threads: 2
  traefik:
    service:
      type: ClusterIP
dask-kubernetes:
  enabled: false
jupyterhub:
  hub:
    config:
      Authenticator:
        admin_users:
        - admin
      JupyterHub:
        admin_access: true
        authenticator_class: nativeauthenticator.NativeAuthenticator
    extraConfig:
      00-add-dask-gateway-values: |
        # 1. Sets `DASK_GATEWAY__PROXY_ADDRESS` in the singleuser environment.
        # 2. Adds the URL for the Dask Gateway JupyterHub service.
        import os
        # These are set by jupyterhub.
        release_name = os.environ['HELM_RELEASE_NAME']
        release_namespace = os.environ['POD_NAMESPACE']
        if 'PROXY_HTTP_SERVICE_HOST' in os.environ:
            # https is enabled, we want to use the internal http service.
            gateway_address = 'http://{}:{}/services/dask-gateway/'.format(
                os.environ['PROXY_HTTP_SERVICE_HOST'],
                os.environ['PROXY_HTTP_SERVICE_PORT'],
            )
            print('Setting DASK_GATEWAY__ADDRESS {} from HTTP service'.format(gateway_address))
        else:
            gateway_address = 'http://proxy-public/services/dask-gateway'
            print('Setting DASK_GATEWAY__ADDRESS {}'.format(gateway_address))
        # Internal address to connect to the Dask Gateway.
        c.KubeSpawner.environment.setdefault('DASK_GATEWAY__ADDRESS', gateway_address)
        # Internal address for the Dask Gateway proxy.
        c.KubeSpawner.environment.setdefault('DASK_GATEWAY__PROXY_ADDRESS', 'gateway://traefik-{}-dask-gateway.{}:80'.format(release_name, release_namespace))
        # Relative address for the dashboard link.
        c.KubeSpawner.environment.setdefault('DASK_GATEWAY__PUBLIC_ADDRESS', '/services/dask-gateway/')
        # Use JupyterHub to authenticate with Dask Gateway.
        c.KubeSpawner.environment.setdefault('DASK_GATEWAY__AUTH__TYPE', 'jupyterhub')
        # Adds Dask Gateway as a JupyterHub service to make the gateway available at
        # {HUB_URL}/services/dask-gateway
        service_url = 'http://traefik-{}-dask-gateway.{}'.format(release_name, release_namespace)
        for service in c.JupyterHub.services:
            if service['name'] == 'dask-gateway':
                if not service.get('url', None):
                    print('Adding dask-gateway service URL')
                    service.setdefault('url', service_url)
                break
        else:
            print('dask-gateway service not found. Did you set jupyterhub.hub.services.dask-gateway.apiToken?')
    nodeSelector:
      node-role.kubernetes.io/master: ""
    services:
      dask-gateway:
        apiToken: token1
    tolerations:
    - key: node-role.kubernetes.io/master
      operator: Exists
  ingress:
    annotations:
      kubernetes.io/ingress.class: nginx
    enabled: true
  proxy:
    chp:
      nodeSelector:
        node-role.kubernetes.io/master: ""
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
    service:
      type: ClusterIP
  singleuser:
    cpu:
      guarantee: 1
      limit: 4
    defaultUrl: /lab
    image:
      name: pangeo/ml-notebook
      tag: latest
    lifecycleHooks:
      postStart:
        exec:
          command:
          - sh
          - -c
          - |
            chmod 700 .ssh; chmod g-s .ssh; chmod 600 .ssh/*; exit 0
    memory:
      guarantee: 1G
      limit: 8G
    startTimeout: 600
    storage:
      capacity: 2Gi
      type: dynamic
rbac:
  enabled: true
Note that this deployment does not use a prefix for JupyterHub.
This is how we create the Gateway:
from dask_gateway import Gateway
gateway = Gateway(
    "http://api-daskhub-dask-gateway.daskhub:8000/",
)
With:
$ sudo helm list -n daskhub
NAME     NAMESPACE  REVISION  UPDATED                                  STATUS    CHART             APP VERSION
daskhub  daskhub    6         2022-08-13 09:52:06.606018286 +0000 UTC  deployed  daskhub-2022.6.0  2022.6.1
I am sorry I lack the background to answer the rest of your questions.
I hope it helps anyway!
Best regards, Sebastian
@j34ni I'm currently trying new things. I just saw you connected to the Hub; that might not work.
Finally! I have a working setup!!
We now have:
The problem was definitely the missing /jupyterhub/ prefix in several places; I had to reconfigure several things.
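To make the dependency on the prefix explicit, here is a minimal sketch that derives the prefix-sensitive settings from a single base URL. The function name `prefixed_settings` is hypothetical, the `proxy-public` host is taken from the configuration in this thread, and the root-deployment API URL is an assumption; this is an illustration, not something the Helm chart provides.

```python
def prefixed_settings(base_url: str, proxy_host: str = "proxy-public") -> dict:
    """Derive the settings that must agree with the JupyterHub base URL."""
    trimmed = base_url.strip("/")
    base = "/" + trimmed if trimmed else ""  # "" when the hub is served at the root
    return {
        "jupyterhub.hub.baseUrl": base + "/",
        "dask-gateway.gateway.prefix": base + "/services/dask-gateway",
        "dask-gateway.gateway.auth.jupyterhub.apiUrl": f"http://{proxy_host}{base}/hub/api",
        "DASK_GATEWAY__ADDRESS": f"http://{proxy_host}{base}/services/dask-gateway",
        "DASK_GATEWAY__PUBLIC_ADDRESS": base + "/services/dask-gateway/",
    }


settings = prefixed_settings("/jupyterhub/")
for key, value in settings.items():
    print(f"{key}: {value}")
```

Calling `prefixed_settings("/")` gives the unprefixed variants, which is a quick way to compare a prefixed deployment with one served at the root.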
Here is a working values.yaml file without the secrets:
dask-gateway:
  enabled: true
  gateway:
    prefix: /jupyterhub/services/dask-gateway
    auth:
      type: jupyterhub
      jupyterhub:
        apiToken: "token1"
        apiUrl: "http://proxy-public/jupyterhub/hub/api"
    extraConfig:
      optionHandler: |
        from dask_gateway_server.options import Options, Integer, Float, String
        def options_handler(options):
            if ":" not in options.image:
                raise ValueError("When specifying an image you must also provide a tag")
            return {
                "worker_cores": options.worker_cores,
                "worker_memory": int(options.worker_memory * 2 ** 30),
                "image": options.image,
            }
        c.Backend.cluster_options = Options(
            Integer("worker_cores", default=1, min=1, max=4, label="Worker Cores"),
            Float("worker_memory", default=1, min=1, max=8, label="Worker Memory (GiB)"),
            String("image", default="pangeo/pangeo-notebook:latest", label="Image"),
            handler=options_handler,
        )
dask-kubernetes:
  enabled: false
jupyterhub:
  hub:
    baseUrl: /jupyterhub/
    services:
      dask-gateway:
        apiToken: "token1"
    extraConfig:
      # Register Dask Gateway service and setup singleuser environment.
      00-add-dask-gateway-values: |
        # 1. Sets `DASK_GATEWAY__PROXY_ADDRESS` in the singleuser environment.
        # 2. Adds the URL for the Dask Gateway JupyterHub service.
        import os
        # These are set by jupyterhub.
        release_name = os.environ["HELM_RELEASE_NAME"]
        release_namespace = os.environ["POD_NAMESPACE"]
        if "PROXY_HTTP_SERVICE_HOST" in os.environ:
            # https is enabled, we want to use the internal http service.
            gateway_address = "http://{}:{}/services/dask-gateway/".format(
                os.environ["PROXY_HTTP_SERVICE_HOST"],
                os.environ["PROXY_HTTP_SERVICE_PORT"],
            )
            print("Setting DASK_GATEWAY__ADDRESS {} from HTTP service".format(gateway_address))
        else:
            gateway_address = "http://proxy-public/jupyterhub/services/dask-gateway"
            print("Setting DASK_GATEWAY__ADDRESS {}".format(gateway_address))
        # Internal address to connect to the Dask Gateway.
        c.KubeSpawner.environment.setdefault("DASK_GATEWAY__ADDRESS", gateway_address)
        # Internal address for the Dask Gateway proxy.
        c.KubeSpawner.environment.setdefault("DASK_GATEWAY__PROXY_ADDRESS", "gateway://traefik-{}-dask-gateway.{}:80".format(release_name, release_namespace))
        # Relative address for the dashboard link.
        c.KubeSpawner.environment.setdefault("DASK_GATEWAY__PUBLIC_ADDRESS", "/jupyterhub/services/dask-gateway/")
        # Use JupyterHub to authenticate with Dask Gateway.
        c.KubeSpawner.environment.setdefault("DASK_GATEWAY__AUTH__TYPE", "jupyterhub")
        # Adds Dask Gateway as a JupyterHub service to make the gateway available at
        # {HUB_URL}/services/dask-gateway
        service_url = "http://traefik-{}-dask-gateway.{}".format(release_name, release_namespace)
        for service in c.JupyterHub.services:
            if service["name"] == "dask-gateway":
                if not service.get("url", None):
                    print("Adding dask-gateway service URL")
                    service.setdefault("url", service_url)
                break
        else:
            print("dask-gateway service not found. Did you set jupyterhub.hub.services.dask-gateway.apiToken?")
    config:
      GenericOAuthenticator:
        allowed_groups:
        - urn:mace:egi.eu:group:vo.pangeo.eu:role=member#aai.egi.eu
        authorize_url: https://aai-dev.egi.eu/auth/realms/egi/protocol/openid-connect/auth
        claim_groups_key: eduperson_entitlement
        client_id: id
        client_secret: secret
        login_service: EGI Check-In
        oauth_callback_url: https://pangeo-foss4g.vm.fedcloud.eu/jupyterhub/hub/oauth_callback
        scope:
        - openid
        - email
        - profile
        - eduperson_entitlement
        token_url: https://aai-dev.egi.eu/auth/realms/egi/protocol/openid-connect/token
        userdata_params:
          state: state
        userdata_url: https://aai-dev.egi.eu/auth/realms/egi/protocol/openid-connect/userinfo
        username_key: preferred_username
      JupyterHub:
        authenticator_class: generic-oauth
  ingress:
    annotations:
      kubernetes.io/ingress.class: nginx
      cert-manager.io/cluster-issuer: "letsencrypt-prod"
    enabled: true
    tls:
    - hosts:
      - pangeo-foss4g.vm.fedcloud.eu
      secretName: pangeo-foss4g.vm.fedcloud.eu
  proxy:
    secretToken: "token2"
    service:
      type: ClusterIP
  singleuser:
    cpu:
      guarantee: 1
      limit: 2
    image:
      name: pangeo/pangeo-notebook
      tag: latest
    memory:
      guarantee: 4G
      limit: 8G
    extraEnv:
      DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE: '{JUPYTER_IMAGE_SPEC}'
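For anyone wanting to check the option handler from the gateway extraConfig without a running gateway, here is a small standalone sketch. It copies the handler logic from the values above and feeds it a `SimpleNamespace`, which is only a stand-in I am assuming for the options object that dask-gateway passes in; the chosen values are made up for the example.

```python
from types import SimpleNamespace

GIB = 2 ** 30  # bytes per GiB, as used in the handler above


def options_handler(options):
    """Validate the image tag and convert worker memory from GiB to bytes."""
    if ":" not in options.image:
        raise ValueError("When specifying an image you must also provide a tag")
    return {
        "worker_cores": options.worker_cores,
        "worker_memory": int(options.worker_memory * GIB),
        "image": options.image,
    }


# Stand-in for the options a user would pick (hypothetical values).
choice = SimpleNamespace(
    worker_cores=2,
    worker_memory=1.5,
    image="pangeo/pangeo-notebook:latest",
)
cluster_config = options_handler(choice)
print(cluster_config)

# An image without a tag is rejected.
try:
    options_handler(SimpleNamespace(worker_cores=1, worker_memory=1, image="pangeo/pangeo-notebook"))
    rejected = False
except ValueError:
    rejected = True
print("untagged image rejected:", rejected)
```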
We should try to store it in this repo, but I'm not sure how to handle the secret part yet. We can split the values.yaml file into two separate files, but we need a GitHub tool to encrypt the secret part. Does anyone know of one?
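As a rough illustration of the split, here is a sketch that separates a nested values dictionary into a public part and a secret part. The set of secret key names is my assumption based on the values above (`apiToken`, `client_secret`, `secretToken`), and `split_values` is a hypothetical helper; it says nothing about how the secret file should be encrypted.

```python
# Assumed names of keys that hold secrets in the values above.
SECRET_KEYS = {"apiToken", "client_secret", "secretToken"}


def split_values(values: dict, secret_keys=SECRET_KEYS):
    """Return (public, secrets), both keeping the original nesting."""
    public, secrets = {}, {}
    for key, value in values.items():
        if isinstance(value, dict):
            sub_public, sub_secrets = split_values(value, secret_keys)
            if sub_public:
                public[key] = sub_public
            if sub_secrets:
                secrets[key] = sub_secrets
        elif key in secret_keys:
            secrets[key] = value
        else:
            public[key] = value
    return public, secrets


values = {
    "dask-gateway": {
        "gateway": {
            "prefix": "/jupyterhub/services/dask-gateway",
            "auth": {"type": "jupyterhub", "jupyterhub": {"apiToken": "token1"}},
        }
    },
    "jupyterhub": {
        "proxy": {"secretToken": "token2", "service": {"type": "ClusterIP"}},
    },
}

public, secrets = split_values(values)
print(public)
print(secrets)
```

Since Helm accepts several `-f` files and merges them, the two parts could be kept as separate files, with only the public one committed in clear text.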
@sebastian-luna-valero thanks for your help.
In your configuration, you override a lot of daskhub defaults without changing them. Was that intended, for the sake of explanation here?
Moreover, with this config (and this works with the one I posted above), you should be able to create a Gateway with no arguments:
from dask_gateway import Gateway
gateway = Gateway()
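The reason no arguments are needed, as far as I understand it, is that the hub sets `DASK_GATEWAY__*` environment variables in the user pod and Dask turns them into configuration. The sketch below is a simplified model of that mapping, not Dask's actual implementation; `env_to_config` is a hypothetical helper and the environment shown is copied from the values in this thread.

```python
def env_to_config(environ: dict, prefix: str = "DASK_") -> dict:
    """Map PREFIX_A__B__C=value to {"a": {"b": {"c": value}}}."""
    config = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # not a Dask setting
        keys = name[len(prefix):].lower().split("__")
        node = config
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return config


# Environment as set for singleuser pods by the extraConfig above.
environ = {
    "DASK_GATEWAY__ADDRESS": "http://proxy-public/jupyterhub/services/dask-gateway",
    "DASK_GATEWAY__PUBLIC_ADDRESS": "/jupyterhub/services/dask-gateway/",
    "DASK_GATEWAY__AUTH__TYPE": "jupyterhub",
    "HOME": "/home/jovyan",  # unrelated variable, ignored
}

config = env_to_config(environ)
print(config["gateway"]["address"])
print(config["gateway"]["auth"]["type"])
```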
@guillaumeeb : Awesome !
As soon as the issues with the IM Dashboard are resolved, we should be able to add nodes to the existing pangeo-foss4g (or create a new infrastructure, without Binderhub?) and make it large enough to accommodate ~25 users at the workshop.
@j34ni it'd be really nice if you could validate this setup (and maybe @tinaok if she has time).
I confirm that I can connect to https://pangeo-foss4g.vm.fedcloud.eu/jupyterhub/ and can start verifying the tutorial notebook there.
I wasn't careful enough on the first try: I went to https://pangeo-foss4g.vm.fedcloud.eu/jupyterhub (without the trailing slash) and landed on a 'binder 404: not found' page...
@guillaumeeb, did you find a way to kill a gateway cluster that I started with my user account? That was a concern of @j34ni at some point.
@tinaok : I cannot check right now (there are not enough CPUs left to allow me to even log in); however, it is unlikely that another user can shut down or even see your cluster, since it uses JupyterHub to authenticate with Dask Gateway.
Great news @guillaumeeb, thank you very much! I built the values.yaml above based on:
@j34ni you can delete user pods with kubectl after ssh'ing into the cluster. To prevent this problem from recurring in the future, I suggest you explicitly delete dask clusters from the trainee's notebook following:
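A minimal sketch of such a cleanup from a notebook, assuming the `list_clusters()` and `stop_cluster()` methods of the dask-gateway client. The `FakeGateway` class is only a stand-in so the snippet runs without a gateway, and the cluster names are invented; in a notebook on the hub you would pass `Gateway()` instead.

```python
from types import SimpleNamespace


def stop_all_clusters(gateway) -> list:
    """Stop every cluster the gateway reports and return their names."""
    stopped = []
    for report in gateway.list_clusters():
        gateway.stop_cluster(report.name)
        stopped.append(report.name)
    return stopped


class FakeGateway:
    """Stand-in with the two methods used above (for demonstration only)."""

    def __init__(self, names):
        self._names = list(names)

    def list_clusters(self):
        return [SimpleNamespace(name=n) for n in self._names]

    def stop_cluster(self, cluster_name):
        self._names.remove(cluster_name)


fake = FakeGateway(["daskhub.aaa111", "daskhub.bbb222"])  # hypothetical names
stopped = stop_all_clusters(fake)
print(stopped)
print(fake.list_clusters())

# In a notebook on the hub (not run here):
# from dask_gateway import Gateway
# stop_all_clusters(Gateway())
```

Calling `cluster.shutdown()` on the cluster object at the end of each notebook achieves the same for a single cluster.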
I think it is safe to close this issue now.
Thanks everyone here for the inputs.
@sebastian-luna-valero I just meant that you don't have to override the default values provided by https://artifacthub.io/packages/helm/dask/daskhub (but you probably know that).
@j34ni @tinaok I actually don't know whether JupyterHub services, auth and tokens prevent other dask-gateway users from seeing clusters. But as @sebastian-luna-valero said, one of the best ways to clean things up is to use kubectl if users didn't do it themselves. You just need to delete the correct Scheduler pod, and all the worker pods connected to it will shut down within a minute.
@sebastian-luna-valero I would really appreciate it if you reopened your pull request documenting how to deploy a Daskhub on CESNET. There are a lot of important things in it; we should just update it a bit.
I tried hard tonight to find a working setup on the pangeo-foss4g platform, but with no luck.
It seems that the JupyterLab call to the JupyterHub services/dask-gateway endpoint always fails. I have not figured out why.
See errors:
The route to dask-gateway looks good (from what I see in the hub pod logs), so I'm not sure where the failure comes from. I tried several variations of the dask-gateway and jupyterhub configurations.
Here is the one I'm at currently:
Key points I've been playing with:
dask-gateway.gateway.prefix: I suspect that having our jupyterhub at /jupyterhub/ makes the configuration complex. I'd like to try a configuration without this prefix at all.
jupyterhub.hub.services.dask-gateway.url: I tried to bypass traefik here, but with no luck.
jupyterhub.proxy.service.type: I believe that the default LoadBalancer type is not available on this Kubernetes instance, and it is probably not necessary as we use an ingress. Changing this to ClusterIP allows the helm chart update to end successfully.
So the next step would be to delete all the binderhub helm config on this platform and try to deploy jupyterhub at the root of the DNS name. I see nothing else to try currently.
Finally, I really don't know how dask-gateway was able to work before, but this is certainly not due to the last helm chart values where it was deactivated.
With the above setup, using dask-gateway should be as simple as calling:
cc @j34ni @sebastian-luna-valero for any thoughts on this.
Also pinging @jacobtomlinson and @consideRatio for external help if they have time for a quick glance, even if they really don't know the context here...