dask / helm-chart

Helm charts for Dask
https://helm.dask.org/

Upgrade daskhub from 4.5.1->4.5.4 appears to break or change config #131

Closed kyprifog closed 2 years ago

kyprifog commented 4 years ago

Using vanilla Daskhub 4.5.1 I get the following dask config:

export API_TOKEN=$(openssl rand -hex 32)

helm upgrade --install dhub dask/daskhub --version v4.5.1 \
  --set jupyterhub.hub.cookieSecret=${API_TOKEN} \
  --set jupyterhub.proxy.secretToken=${API_TOKEN} \
  --set jupyterhub.proxy.service.type=LoadBalancer \
  --set jupyterhub.hub.services.dask-gateway.apiToken=${API_TOKEN} \
  --set dask-gateway.gateway.auth.jupyterhub.apiToken=${API_TOKEN}

dask.config.config
>>
{'distributed': {'version': 2,
  'dashboard': {'link': '/user/{JUPYTERHUB_USER}/proxy/{port}/status'},
  'scheduler': {'idle-timeout': '3600s'},
  'admin': {'tick': {'limit': '5s'}}},
 'logging': {'distributed': 'warning',
  'bokeh': 'critical',
  'tornado': 'critical',
  'tornado.application': 'error'},
 'kubernetes': {'name': 'dask-{JUPYTERHUB_USER}-{uuid}',
  'worker-template': {'spec': {'serviceAccount': 'daskkubernetes',
    'restartPolicy': 'Never',
    'containers': [{'name': 'dask-worker',
      'image': '${JUPYTER_IMAGE_SPEC}',
      'args': ['dask-worker',
       '--nthreads',
       '2',
       '--no-dashboard',
       '--memory-limit',
       '7GB',
       '--death-timeout',
       '60'],
      'resources': {'limits': {'cpu': '1.75', 'memory': '7G'},
       'requests': {'cpu': 1, 'memory': '7G'}}}]}}},
 'labextension': {'factory': {'module': 'dask_gateway',
   'class': 'GatewayCluster',
   'args': [],
   'kwargs': {}}},
 'gateway': {'auth': {'type': 'jupyterhub'},
  'public_address': '/services/dask-gateway/',
  'address': 'http://100.67.197.82:8000/services/dask-gateway/',
  'proxy_address': 'gateway://traefik-dhub-dask-gateway.default:80'},
 'root_config': '/srv/conda/etc',
 'temporary-directory': None,
 'dataframe': {'shuffle-compression': None},
 'array': {'svg': {'size': 120}},
 'optimization': {'fuse': {'active': True,
   'ave-width': 1,
   'max-width': None,
   'max-height': None,
   'max-depth-new-edges': None,
   'subgraphs': None,
   'rename-keys': True}}}

As a result, when I open the Dask labextension and click new cluster, I get a GatewayCluster. This seems to be working correctly.

However, changing nothing but the daskhub version to 4.5.4 (the current version, so the same command without --version v4.5.1), I get the following dask.config.config:

{'gateway': {'auth': {'type': 'jupyterhub'},
  'public_address': '/services/dask-gateway/',
  'address': 'http://100.70.59.227:8000/services/dask-gateway/',
  'proxy_address': 'gateway://traefik-dhub-dask-gateway.default:80'},
 'root_config': '/srv/conda/etc',
 'temporary-directory': None,
 'dataframe': {'shuffle-compression': None},
 'array': {'svg': {'size': 120}, 'slicing': {'split-large-chunks': None}},
 'optimization': {'fuse': {'active': True,
   'ave-width': 1,
   'max-width': None,
   'max-height': inf,
   'max-depth-new-edges': None,
   'subgraphs': None,
   'rename-keys': True}}}

As a result, when I create a new cluster it spins up a LocalCluster (the labextension settings are missing, among other configs). Is this intentional? Can some documentation be added on configuring 4.5.4 the same way 4.5.1 was configured?
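
To make the comparison between the two dumps concrete, here is a small sketch (not part of the original report) that lists the dotted keys present in one nested config dict but absent from another. The helper names `flatten` and `missing_keys` and the trimmed sample dicts are illustrative assumptions; in a live session the two inputs would be the `dask.config.config` values captured under each chart version.

```python
def flatten(config, prefix=""):
    """Return the set of dotted leaf keys in a nested dict."""
    keys = set()
    for name, value in config.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict) and value:
            # Recurse into non-empty sub-dicts.
            keys |= flatten(value, path)
        else:
            keys.add(path)
    return keys


def missing_keys(old, new):
    """Dotted keys that exist in `old` but not in `new`, sorted."""
    return sorted(flatten(old) - flatten(new))


# Trimmed excerpts of the two dumps above (illustrative, not complete).
config_451 = {
    "labextension": {"factory": {"module": "dask_gateway", "class": "GatewayCluster"}},
    "distributed": {"dashboard": {"link": "/user/{JUPYTERHUB_USER}/proxy/{port}/status"}},
    "gateway": {"auth": {"type": "jupyterhub"}},
}
config_454 = {
    "gateway": {"auth": {"type": "jupyterhub"}},
}

for key in missing_keys(config_451, config_454):
    print(key)
```

On these excerpts the missing keys are the `labextension.factory.*` entries and `distributed.dashboard.link`, which matches the observed fallback to a LocalCluster.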

kyprifog commented 4 years ago

git diff 0.8.0 0.9.0 on dask-gateway leads me to believe one of these changes caused the difference I'm seeing here.

jacobtomlinson commented 3 years ago

@jcrist could you help point in the direction of what needs updating here?

TomAugspurger commented 3 years ago

@kyprifog can you confirm that you're using the default single user image?

https://github.com/pangeo-data/pangeo-docker-images/pull/152 changed how the config works. Will need to look into how to do this properly.

kyprifog commented 3 years ago

@TomAugspurger yeah, I'm using the default for everything. This is the only thing I'm overriding:

jupyterhub:
  loglevel: TRACE
  proxy:
    https:
      enabled: true
      hosts:
        - <HOST>
      letsencrypt:
        contactEmail: <EMAIL>

You can see the full command I'm using above to spin it up, just the default remote dask/daskhub. I'm running the upgrade once to get the SSL stuff I referenced here.

TomAugspurger commented 3 years ago

OK, thanks.

For now, I recommend baking the config into your application config (similar to https://github.com/pangeo-data/pangeo-cloud-federation/pull/820/files).

kyprifog commented 3 years ago

@TomAugspurger sorry for being dense about this, but do you know how I would configure these bits using that approach?

 'gateway': {'auth': {'type': 'jupyterhub'},
  'public_address': '/services/dask-gateway/',
  'address': 'http://100.67.197.82:8000/services/dask-gateway/',
  'proxy_address': 'gateway://traefik-dhub-dask-gateway.default:80'},

I tried this, but with no luck (it gets stuck at "Cluster starting..." in the lab extension):

              DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE: '{JUPYTER_IMAGE_SPEC}'
              DASK_GATEWAY__PROXY_ADDRESS: "gateway://traefik-dhub-dask-gateway.default:80"
              DASK_GATEWAY__ADDRESS: "http://proxy-http.default:8000/services/dask-gateway/"
              DASK_GATEWAY__AUTH__TYPE: "jupyterhub"
              DASK_DISTRIBUTED__DASHBOARD_LINK: '/user/{JUPYTERHUB_USER}/proxy/{port}/status'
              DASK_LABEXTENSION__FACTORY__MODULE: 'dask_gateway'
              DASK_LABEXTENSION__FACTORY__CLASS: 'GatewayCluster'

Maybe I'll just wait to hear back from @jcrist, because the working config above has a lot of moving parts and I may be missing other essential pieces.
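
For reference, the sketch below illustrates how `DASK_*` environment variables are understood to map onto config keys under Dask's documented convention: the `DASK_` prefix is dropped, the name is lowercased, and a double underscore marks a nesting level. The function `env_to_config` is a simplified stand-in written for this illustration, not Dask's own code, and the last variable in the sample is a hypothetical alternative spelling added for comparison.

```python
import ast


def env_to_config(environ):
    """Map DASK_* variables to dotted config keys.

    Simplified sketch of the documented convention: drop the ``DASK_``
    prefix, lowercase the rest, and treat ``__`` as a nesting separator.
    Values are parsed as Python literals when possible, else kept as strings.
    """
    result = {}
    for name, value in environ.items():
        if not name.startswith("DASK_"):
            continue
        key = name[len("DASK_"):].lower().replace("__", ".")
        try:
            result[key] = ast.literal_eval(value)
        except (SyntaxError, ValueError):
            result[key] = value
    return result


env = {
    "DASK_GATEWAY__AUTH__TYPE": "jupyterhub",
    "DASK_LABEXTENSION__FACTORY__CLASS": "GatewayCluster",
    # Spelling used in the attempt above: single underscore before LINK.
    "DASK_DISTRIBUTED__DASHBOARD_LINK": "/user/{JUPYTERHUB_USER}/proxy/{port}/status",
    # Hypothetical alternative: double underscore, giving a nested key.
    "DASK_DISTRIBUTED__DASHBOARD__LINK": "/user/{JUPYTERHUB_USER}/proxy/{port}/status",
}
for key, value in env_to_config(env).items():
    print(key, "=", value)
```

Under this convention `DASK_DISTRIBUTED__DASHBOARD_LINK` lands at `distributed.dashboard_link`, whereas the working 4.5.1 dump has the nested key `distributed.dashboard.link`, which would correspond to `DASK_DISTRIBUTED__DASHBOARD__LINK`. Whether that difference contributes to the "Cluster starting..." hang is not established in this thread.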

TomAugspurger commented 3 years ago

At a glance, that looks about right. I think that’s what we’re using in pangeo’s deployments (though it’s possible things aren’t working there either).

kyprifog commented 3 years ago

@jcrist @TomAugspurger Any further thoughts on this? I noticed this issue, which could possibly be related: https://github.com/dask/dask-gateway/issues/348

I have not had any luck changing those variables to get it working like it did in 4.5.1.

kyprifog commented 3 years ago

Awaiting: https://github.com/dask/dask-gateway/issues/381 (I think)

consideRatio commented 2 years ago

Awaiting: https://github.com/dask/dask-gateway/issues/381 (I think)

Closed as resolved by a new release of dask-gateway.