ComputeCanada / magic_castle

Terraform modules to replicate the HPC user experience in the cloud
MIT License

Add specs recommendation based on workshop attendance #2

Open cmd-ntrf opened 4 years ago

plstonge commented 4 years ago

Main nodes

  1. Minimum:
    • mgmt = { type = "p4-6gb", count = 1 }
    • login = { type = "p2-3gb", count = 1 }
  2. Watch the total memory used when all participants compile or install software on the login node at the same time. Suggestions:
    • login = { type = "p4-6gb", count = 1 } for a group of 20 participants
    • login = { type = "p8-12gb", count = 1 } for a group of 40 participants or more
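The login node suggestions above can be sketched as a small helper. This is only an illustration: the function names are hypothetical, and the handling of group sizes between the stated figures (treating 20 and 40 as lower bounds) is an assumption, not something Magic Castle enforces.

```python
def recommend_login_flavor(participants: int) -> str:
    """Pick a login node flavor from the expected number of participants.

    Assumption: the group sizes quoted in the thread are treated as lower
    bounds, so 20-39 participants map to p4-6gb and 40+ to p8-12gb.
    """
    if participants < 0:
        raise ValueError("participants must be non-negative")
    if participants >= 40:
        return "p8-12gb"
    if participants >= 20:
        return "p4-6gb"
    return "p2-3gb"  # the minimum flavor listed in the thread


def login_instance(participants: int) -> str:
    """Format the recommendation like the instance lines in the list above."""
    flavor = recommend_login_flavor(participants)
    return f'login = {{ type = "{flavor}", count = 1 }}'


print(login_instance(25))
```

For a workshop close to a threshold, picking the larger flavor is probably the safer choice, given the memory pressure described for simultaneous installs.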

Compute nodes - for each type of workshop

Storage

cmd-ntrf commented 4 years ago

@plstonge said this regarding login node flavor:

Installing an R package on the login node took too much memory when 20+ participants were doing the installation simultaneously. The login node almost crashed. One workaround is to pre-install the package before the workshop, but this is not the same as doing it as an exercise.

mboisson commented 4 years ago

A useful YAML configuration file that I used for an "Intro to ARC" workshop using the terminal:

jupyterhub::jupyterhub_config_hash:
        SbatchForm:
                runtime:
                        min: 3.5
                        def: 3.5
                        max: 5.0
                nprocs:
                        min: 1
                        def: 1
                        max: 1
                memory:
                        min: 1024
                        max: 2048
                        def: 2048
                oversubscribe:
                        def: true
                        lock: true
                ui:
                        def: 'terminal'
        SlurmFormSpawner:
                disable_form: true

jupyterhub::enable_otp_auth: false

#jupyterhub::kernel::venv::packages: ['numpy', 'pandas', 'matplotlib']

profile::cvmfs::client::lmod_default_modules: ['nixpkgs/16.09', 'imkl/2018.3.222', 'gcc/7.3.0', 'openmpi/3.1.2', 'ipython-kernel/3.7', 'rstudio-server', 'openrefine']

with the option

  hieradata = file("config.yaml")

in the main.tf

mboisson commented 4 years ago

I think @ccoulombe has some interesting parameters for admin users as well.

plstonge commented 4 years ago

For a (Data Carpentry) Data Analysis with Python workshop with 50 participants and up to 6 hosts/co-hosts (with the Admin role on Jupyter), I had 7 nodes of 8c-12gb. The config.yaml used was:

jupyterhub::admin_groups: ['user51', 'user52', 'user53', 'user54', 'user55', 'user56']

jupyterhub::enable_otp_auth: false

jupyterhub::jupyterhub_config_hash:
        SbatchForm:
                runtime:
                        min: 3.5
                        def: 3.5
                        max: 5.0
                nprocs:
                        min: 1
                        def: 1
                        max: 1
                memory:
                        min: 1024
                        def: 1280
                        max: 1536
                oversubscribe:
                        def: true
                        lock: true
                ui:
                        def: 'notebook'
        SlurmFormSpawner:
                disable_form: true

jupyterhub::kernel::venv::packages: ['numpy', 'pandas', 'matplotlib', 'plotnine']

profile::cvmfs::client::lmod_default_modules: ['nixpkgs/16.09', 'imkl/2018.3.222', 'gcc/7.3.0', 'openmpi/3.1.2']
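
As a rough check of how those numbers fit together, here is a minimal sketch that compares what the sessions request against the nominal cluster totals. It assumes that "8c-12gb" means 8 cores and 12 GB of RAM per node, that all 50 participants and 6 hosts run a session at the same time, and that each session gets the default values from the config above (1 core, 1280 MB). The function name is hypothetical.

```python
def check_capacity(users, nodes, cores_per_node, node_mem_mb, nprocs=1, mem_mb=1280):
    """Compare the resources requested by concurrent sessions with nominal totals.

    Nominal totals ignore memory reserved for the OS and for Slurm itself,
    so the real headroom is smaller than what this reports.
    """
    cores_needed = users * nprocs
    mem_needed_mb = users * mem_mb
    cores_available = nodes * cores_per_node
    mem_available_mb = nodes * node_mem_mb
    return {
        "cores_needed": cores_needed,
        "cores_available": cores_available,
        "mem_needed_mb": mem_needed_mb,
        "mem_available_mb": mem_available_mb,
        "fits": cores_needed <= cores_available
        and mem_needed_mb <= mem_available_mb,
    }


# 50 participants + 6 hosts on 7 nodes, assumed to be 8 cores / 12 GB each
report = check_capacity(users=50 + 6, nodes=7, cores_per_node=8, node_mem_mb=12 * 1024)
for key, value in report.items():
    print(f"{key}: {value}")
```

Under these assumptions the 56 sessions use exactly the 56 available cores, and the default memory request totals 71680 MB against a nominal 86016 MB. At the 1536 MB maximum the requests would add up to the full nominal 86016 MB, which leaves nothing for the operating system.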