2i2c-org / infrastructure

Infrastructure for configuring and deploying our community JupyterHubs.
https://infrastructure.2i2c.org
BSD 3-Clause "New" or "Revised" License

New Hub: Carbon Plan #291

Closed choldgraf closed 3 years ago

choldgraf commented 3 years ago

Background

CarbonPlan is a non-profit that does work at the intersection of data science / climate modeling / advocacy. @jhamman has been running a few JupyterHubs for CarbonPlan for a while now, and he'd like to transfer operational duties to 2i2c.

This is likely a bit more complex than the "standard Pangeo hubs" we have set up. I believe @jhamman runs a couple of hubs for them (perhaps he can provide context below).

Setup Information

Important Information

Deploy To Do

Follow up issues

jhamman commented 3 years ago

Hi all! Here's some info to fill in the blanks above. What we have right now looks pretty similar to the Pangeo-Cloud-Federation hubs. In fact, https://github.com/carbonplan/hub is a fork of that project with two hubs in it: one on GCP, one on Azure. The Azure hub needs to be moved to a new account, so it will need a rebuild. The Google hub is in more of a maintenance mode and doesn't need much work beyond updating to the new dask-hub chart.

A lot of our devops falls into two main areas:

  1. Environment management -- we are working on multiple projects that require bespoke environments.
  2. Custom resources -- sometimes this means GPUs, sometimes large VMs, and other times it's Dask-related.

We're likely to need more than one image per hub (more like one or more per project we work on), so that's a consideration worth discussing.

Setup Information

Important Information

yuvipanda commented 3 years ago

@jhamman can you create AWS users with full access for me (yuvipanda@2i2c.org) and @damianavila (damianavila@2i2c.org)?

yuvipanda commented 3 years ago

I'm making dask worker instances be spot instances now. Still need to figure out how users can effectively select instance size for dask workers.
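
For reference, here's a minimal sketch of how worker pods could be steered onto dedicated (tainted) spot nodes through the daskhub chart. This assumes the dask-gateway chart's gateway.backend.worker.extraPodConfig key, and the worker label/taint names are placeholders that would need to match whatever the nodegroups actually set:

daskhub:
  dask-gateway:
    gateway:
      backend:
        worker:
          # Schedule dask worker pods onto the dedicated spot worker nodes.
          # Label/taint names below are placeholders.
          extraPodConfig:
            nodeSelector:
              worker: "true"
            tolerations:
              - key: "worker"
                operator: "Equal"
                value: "true"
                effect: "NoSchedule"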

consideRatio commented 3 years ago

Should users really select instance size rather than pod CPU/memory requests, though? I have two suggestions based on experience with learning-2-learn/l2lhub-deployment on AWS; I don't know if they are relevant or not, but here goes.

  1. One node type per instance group. Make sure to have only one instance group of a certain CPU size, to ensure the cluster-autoscaler can choose nodes correctly based on a pending pod's resource requests. If you put node types of different sizes in an instance group, it may fail to scale up a suitable node.

  2. A simple worker resource requests UX. When letting users choose resource requests for a worker in the cluster options, I think it makes sense to provide choices that fit well on the provided nodes, in discrete steps. In other words, instead of free-choice CPU / memory, just provide a list of allowed choices such as: 1 CPU 4 GB, 2 CPU 8 GB, 4 CPU 16 GB, 8 CPU 32 GB, 16 CPU 64 GB. This configuration needs some slack to ensure pods fit even though a node has some daemonsets running on it as well.

Example suggestion 1 - Configuration of AWS instance groups

This is from infra/eksctl-cluster-config.yaml.

  # Important about spot nodes!
  #
  # "Due to the Cluster Autoscaler’s limitations (more on that in the next
  # section) on which Instance type to expand, it’s important to choose
  # instances of the same size (vCPU and memory) for each InstanceGroup."
  #
  # ref: https://medium.com/riskified-technology/run-kubernetes-on-aws-ec2-spot-instances-with-zero-downtime-f7327a95dea
  #
  - name: worker-xlarge
    availabilityZones: [us-west-2d, us-west-2b, us-west-2a]
    minSize: 0
    maxSize: 20
    desiredCapacity: 0
    volumeSize: 80
    labels:
      worker: "true"
    taints:
      worker: "true:NoSchedule"
    tags:
      k8s.io/cluster-autoscaler/node-template/label/worker: "true"
      k8s.io/cluster-autoscaler/node-template/taint/worker: "true:NoSchedule"
    iam:
      withAddonPolicies:
        autoScaler: true
    # Spot instance configuration
    instancesDistribution:  # ref: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-autoscaling-autoscalinggroup-instancesdistribution.html
      instanceTypes:
        - m5.xlarge    # 57 pods, 4 cpu, 16 GB
        - m5a.xlarge    # 57 pods, 4 cpu, 16 GB
        - m5n.xlarge    # 57 pods, 4 cpu, 16 GB
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0
      spotAllocationStrategy: "capacity-optimized"  # ref: https://aws.amazon.com/blogs/compute/introducing-the-capacity-optimized-allocation-strategy-for-amazon-ec2-spot-instances/

  - name: worker-2xlarge
    availabilityZones: [us-west-2d, us-west-2b, us-west-2a]
    minSize: 0
    maxSize: 20
    desiredCapacity: 0
    volumeSize: 80
    labels:
      worker: "true"
    taints:
      worker: "true:NoSchedule"
    tags:
      k8s.io/cluster-autoscaler/node-template/label/worker: "true"
      k8s.io/cluster-autoscaler/node-template/taint/worker: "true:NoSchedule"
    iam:
      withAddonPolicies:
        autoScaler: true
    # Spot instance configuration
    instancesDistribution:  # ref: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-autoscaling-autoscalinggroup-instancesdistribution.html
      instanceTypes:
        - m5.2xlarge   # 57 pods, 8 cpu, 32 GB
        - m5a.2xlarge   # 57 pods, 8 cpu, 32 GB
        - m5n.2xlarge   # 57 pods, 8 cpu, 32 GB
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0
      spotAllocationStrategy: "capacity-optimized"  # ref: https://aws.amazon.com/blogs/compute/introducing-the-capacity-optimized-allocation-strategy-for-amazon-ec2-spot-instances/

  - name: worker-4xlarge
    availabilityZones: [us-west-2d, us-west-2b, us-west-2a]
    minSize: 0
    maxSize: 20
    desiredCapacity: 0
    volumeSize: 80
    labels:
      worker: "true"
    taints:
      worker: "true:NoSchedule"
    tags:
      k8s.io/cluster-autoscaler/node-template/label/worker: "true"
      k8s.io/cluster-autoscaler/node-template/taint/worker: "true:NoSchedule"
    iam:
      withAddonPolicies:
        autoScaler: true
    # Spot instance configuration
    instancesDistribution:  # ref: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-autoscaling-autoscalinggroup-instancesdistribution.html
      instanceTypes:
        - m5.4xlarge   # 233 pods, 16 cpu, 64 GB
        - m5a.4xlarge   # 233 pods, 16 cpu, 64 GB
        - m5n.4xlarge   # 233 pods, 16 cpu, 64 GB
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0
      spotAllocationStrategy: "capacity-optimized"  # ref: https://aws.amazon.com/blogs/compute/introducing-the-capacity-optimized-allocation-strategy-for-amazon-ec2-spot-instances/

Example suggestion 2 - Configuration of dask worker requests

This is Helm values configuring a daskhub Helm chart deployment.

daskhub:
  jupyterhub:
    singleuser:
      extraEnv:
        # The default worker image matches the singleuser image.
        DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE: '{JUPYTER_IMAGE_SPEC}'
        DASK_DISTRIBUTED__DASHBOARD_LINK: '/user/{JUPYTERHUB_USER}/proxy/{port}/status'
        DASK_LABEXTENSION__FACTORY__MODULE: 'dask_gateway'
        DASK_LABEXTENSION__FACTORY__CLASS: 'GatewayCluster'

  # Reference on the configuration options:
  # https://github.com/dask/dask-gateway/blob/master/resources/helm/dask-gateway/values.yaml
  dask-gateway:
    gateway:
      prefix: "/services/dask-gateway"  # Connect to Dask-Gateway through a JupyterHub service.
      auth:
        type: jupyterhub  # Use JupyterHub to authenticate with Dask-Gateway
      extraConfig:
        # This configuration represents options that can be presented to users
        # that want to create a Dask cluster using dask-gateway. For more
        # details, see https://gateway.dask.org/cluster-options.html
        #
        # The goal is to provide a simple configuration that allows the user some
        # flexibility while also fitting well on AWS nodes, which all have a
        # 1:4 ratio between CPU and GB of memory. By providing the
        # username label, we help administrators to track user pods.
        option_handler: |
          from dask_gateway_server.options import Options, Select, String, Mapping
          def cluster_options(user):
              def option_handler(options):
                  if ":" not in options.image:
                      raise ValueError("When specifying an image you must also provide a tag")

                  extra_labels = {
                      "hub.jupyter.org/username": user.name,
                  }
                  chosen_worker_cpu = int(options.worker_specification.split("CPU")[0])
                  chosen_worker_memory = 4 * chosen_worker_cpu

                  # We multiply the requests by a fraction to ensure that the
                  # workers fit well within a node that needs some resources
                  # reserved for system pods.
                  return {
                      "image": options.image,
                      "worker_cores": 0.80 * chosen_worker_cpu,
                      "worker_cores_limit": chosen_worker_cpu,
                      "worker_memory": "%fG" % (0.90 * chosen_worker_memory),
                      "worker_memory_limit": "%fG" % chosen_worker_memory,
                      "scheduler_extra_pod_labels": extra_labels,
                      "worker_extra_pod_labels": extra_labels,
                      "environment": options.environment,
                  }
              return Options(
                  Select(
                      "worker_specification",
                      ["1CPU, 4GB", "2CPU, 8GB", "4CPU, 16GB", "8CPU, 32GB", "16CPU, 64GB"],
                      default="1CPU, 4GB",
                      label="Worker specification",
                  ),
                  String("image", default="my-custom-image:latest", label="Image"),
                  Mapping("environment", {}, label="Environment variables"),
                  handler=option_handler,
              )
          c.Backend.cluster_options = cluster_options
        idle: |
          # timeout after 30 minutes of inactivity
          c.KubeClusterConfig.idle_timeout = 1800
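
For completeness, this is roughly what those options look like from the user side with the dask-gateway client (a sketch only; the chosen specification string is just one of the Select choices defined above):

from dask_gateway import Gateway

gateway = Gateway()
options = gateway.cluster_options()            # fetches the options exposed by option_handler above
options.worker_specification = "4CPU, 16GB"    # one of the Select choices
cluster = gateway.new_cluster(options)
cluster.adapt(minimum=0, maximum=10)           # let the gateway add/remove workers as needed
client = cluster.get_client()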

choldgraf commented 3 years ago

A simple worker resource requests UX

This makes a lot of sense to me. Some kind of Fibonacci-style set of options could simplify the decision fatigue a bit.

damianavila commented 3 years ago

I like @consideRatio's proposal. I also think there is some value in providing preconfigured options. That should simplify the implementation (not having to deal with unknown, weird config values) and reduce decision fatigue at the moment the user makes a choice, as @choldgraf pointed out.

orianac commented 3 years ago

Hi all -- thanks for getting this hub going! I'm a research scientist at CarbonPlan with @jhamman. We've started working on the hub and it's looking great. I've noticed that when firing up a cluster I need to specify options like so:

from dask_gateway import Gateway

gateway = Gateway()
options = gateway.cluster_options()
options.environment = {'AWS_REQUEST_PAYER': 'requester'}

but it appears an administrator needs to explicitly expose those options in the configuration (https://gateway.dask.org/cluster-options.html). Thanks in advance!

yuvipanda commented 3 years ago

Thanks for reporting that, @orianac! I've deployed https://github.com/2i2c-org/pilot-hubs/pull/419, and you should be able to set environment as an option now. Give it a try?

orianac commented 3 years ago

Awesome! That was probably the fastest fix I've ever experienced! 🚀 Thanks @yuvipanda!

yuvipanda commented 3 years ago

@consideRatio yeah, agree that Select is the way to go. We shouldn't expose users directly to instance names.

Perhaps we also need to figure out a way to keep this config in sync with kubespawner's profiles.
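
One rough way that could look (a sketch only -- the shared list and the generated profile_list below are hypothetical, not config we have in pilot-hubs today) is to define the sizes once and derive both the kubespawner profiles and the dask-gateway Select choices from them:

# Hypothetical shared definition of the allowed sizes.
WORKER_SPECS = [
    {"cpu": 1, "memory_gb": 4},
    {"cpu": 2, "memory_gb": 8},
    {"cpu": 4, "memory_gb": 16},
    {"cpu": 8, "memory_gb": 32},
    {"cpu": 16, "memory_gb": 64},
]

# kubespawner side: one user-server profile per size.
c.KubeSpawner.profile_list = [
    {
        "display_name": f"{s['cpu']} CPU, {s['memory_gb']} GB",
        "kubespawner_override": {
            "cpu_limit": s["cpu"],
            "mem_limit": f"{s['memory_gb']}G",
        },
    }
    for s in WORKER_SPECS
]

# dask-gateway side: the same list feeds the Select choices used in option_handler.
worker_choices = [f"{s['cpu']}CPU, {s['memory_gb']}GB" for s in WORKER_SPECS]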

damianavila commented 3 years ago

Perhaps we also need to figure out a way to keep this config in sync with kubespawner's profiles.

Can you expand on these thoughts, @yuvipanda? Thx!

choldgraf commented 3 years ago

Hey all - so the Carbon Plan hub exists now right? What else do we need to do before resolving this issue?

yuvipanda commented 3 years ago

I think we need to:

choldgraf commented 3 years ago

Cool - I've updated the top comment with these new steps (feel free to update it yourself in general if you like!)

jhamman commented 3 years ago

In addition to the worker spot instance todos, I think we may need a bigger VM in the cluster config. Selecting our "Huge: r5.8xlarge ~32 CPU, ~256 GB RAM" option results in a no-scale-up timeout.
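
For reference, a larger nodegroup modeled on the worker groups above might look roughly like this (a sketch only -- the group name, size limits, and spot settings are assumptions, not the config that was actually deployed):

  - name: worker-8xlarge
    availabilityZones: [us-west-2d, us-west-2b, us-west-2a]
    minSize: 0
    maxSize: 10
    desiredCapacity: 0
    volumeSize: 80
    labels:
      worker: "true"
    taints:
      worker: "true:NoSchedule"
    tags:
      k8s.io/cluster-autoscaler/node-template/label/worker: "true"
      k8s.io/cluster-autoscaler/node-template/taint/worker: "true:NoSchedule"
    iam:
      withAddonPolicies:
        autoScaler: true
    instancesDistribution:
      instanceTypes:
        - r5.8xlarge   # 32 cpu, 256 GB
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0
      spotAllocationStrategy: "capacity-optimized"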

yuvipanda commented 3 years ago

@jhamman try again? I'll have a PR up shortly, but the deploy already seems to help.

yuvipanda commented 3 years ago

(https://github.com/2i2c-org/pilot-hubs/pull/430)

jhamman commented 3 years ago

Seems to be working now. Thanks @yuvipanda!

jhamman commented 3 years ago

Hey folks! A few updates / requests / questions after using the hub for a few weeks:

Things are generally working great

@orianac, @tcchiao, and I have been using the hubs daily. The hub has been quite stable and feature complete, so nice work on the initial rollout!

Config update requests

A few things we'd like to change in the configuration:

  1. Default user environment -> jupyterlab
  2. Increase the maximum memory per worker in dask-gateway clusters, currently set to 32 GB; we're asking for a change to 64 GB.

Questions

  1. Did spot instances get implemented for dask clusters?
  2. What is the instance type being used for the dask workers? We'll be running workloads that are pretty memory hungry, so if it's easy to do, a node pool of high-memory instances (X1?) may be in order.
  3. I'm not sure if this is on 2i2c's radar just yet, but I'm curious what the story around managing storage permissions via service accounts is. The Pangeo hubs on AWS and GCP have managed to dial this in reasonably well, so I'm just surfacing it for future conversation (one possible approach is sketched below).
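
For context, one common pattern for this on EKS is IAM Roles for Service Accounts (IRSA), where an IAM policy is bound to the Kubernetes service account that user and worker pods run under. A minimal eksctl sketch is below; the service account name, namespace, and policy are placeholders, not something configured on this hub:

# Cluster-level iam section (not per-nodegroup); requires an OIDC provider.
iam:
  withOIDC: true
  serviceAccounts:
    - metadata:
        name: user-sa            # placeholder
        namespace: carbonplan    # placeholder
      attachPolicyARNs:
        - arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess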

yuvipanda commented 3 years ago

Default user environment -> jupyterlab

Done

Increase maximum memory per worker in dask-gateway clusters, currently set to 32gb, asking for a change to 64gb

Done; I removed the upper limit.

What is the instance type being used for the dask workers? We'll be running workloads that are pretty memory hungry so if its easy to do, a node pool of high-memory instances (X1?) may be in order.

The same set of instances you see when you create a new user server; the dask and notebook nodes mirror the same config.

I'm not sure if this is on 2i2c's radar just yet but I'm curious what story around managing storage permissions via service account is? The Pangeo hubs on aws and gcp have managed to dial this in reasonably well so just surfacing for future conversation.

A good chunk of fiddling is happening on the GCP setup, not so much on AWS yet. PANGEO_SCRATCH should hopefully make its way over soon.
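
As a rough sketch of what that could eventually look like on this hub (the bucket name is a placeholder, and this only sets the variable -- the bucket itself and per-user prefixes would still need to be handled):

daskhub:
  jupyterhub:
    singleuser:
      extraEnv:
        # Placeholder bucket; not configured on this hub yet.
        PANGEO_SCRATCH: 's3://carbonplan-scratch'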

choldgraf commented 3 years ago

@jhamman in a recent meeting we decided to close out this issue and consider it "finished", since it is just the "first deployment" issue and the CP hub has been running for a couple of months now. We've got the two extra issues about spot instances and dask workers on our deliverables board and can track those improvements there. I'll close this, but if you've got a strong objection to that, feel free to speak up and we can discuss!