2i2c-org / infrastructure

Infrastructure for configuring and deploying our community JupyterHubs.
https://infrastructure.2i2c.org
BSD 3-Clause "New" or "Revised" License

Discussion towards documenting guidance for deciding on a users-per-node size #3177

Open consideRatio opened 1 year ago

consideRatio commented 1 year ago

Consider the graph of active users from ubc-eoas last month:

image

This graph shows a peak of 86 users at one point in time. The peak looks like this:

image

If we can choose a node size that fits any multiple of two users, which users-per-node size seems like a good choice in this case?

Extreme 1: all users on one node

Extreme 2: a dedicated node per user

Guidance to find the middle ground

Please join me in discussing what middle ground makes sense, so that we can document guidance on how to identify it.

Using ubc-eoas usage above as an example, how many users do you think could make sense for them to have per node, and based on what?

consideRatio commented 1 year ago

I suggest we settle on providing guidance based on something simple enough, such as:

Guidance draft 1

This is my initial guidance suggestion, put in three steps.

  1. (lower bound) Choose a node size that fits at least the minimum number of users
  2. (upper bound) Choose a node size small enough that peak usage results in at least around 3 nodes
  3. (preference) Prefer a node size from the default set of sizes suggested in #3176
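
As a rough illustration of steps 1 and 2, here is a minimal Python sketch of the bounds. The function names, the default of 3 nodes at peak, and the choice to let the lower bound win when the two bounds conflict are my assumptions for illustration, not something decided in this issue. It also ignores step 3 and any per-node overhead.

```python
import math


def users_per_node_bounds(min_users, peak_users, min_nodes_at_peak=3):
    """Return (lower, upper) bounds on users per node.

    Hypothetical helper: lower follows step 1 (fit the minimum number of
    users, and at least one user), upper follows step 2 (peak usage should
    spread over at least `min_nodes_at_peak` nodes).
    """
    lower = max(1, min_users)
    # Assumption: if the bounds conflict, the lower bound takes precedence.
    upper = max(lower, peak_users // min_nodes_at_peak)
    return lower, upper


def nodes_at_peak(peak_users, users_per_node):
    """Number of nodes needed to fit peak_users at a given users-per-node size."""
    return math.ceil(peak_users / users_per_node)


# Example values: min users 0, max users 100.
lower, upper = users_per_node_bounds(min_users=0, peak_users=100)
```

With these inputs, `upper` is 33, and the observed peak of 86 users would need `nodes_at_peak(86, 33)` nodes, which is 3.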

Applied to the ubc-eoas example above, and assuming min users is 0 and max users is 100, we should use a node that fits at most ~33 users (100 / 3). Assuming the requested memory is 1 GB per user, we arrive at a 4 CPU, 32 GB node. Here is why: