eWaterCycle / infra

Instructions for system administrators to deploy the eWaterCycle platform
Apache License 2.0

This repo contains (codified) instructions for deploying the eWaterCycle platform. The target audience of these instructions is system administrators. For more information on the eWaterCycle platform (and how to deploy it) see the eWaterCycle documentation.

For instructions on how to use the machine as deployed by this repo see the User guide.

These instructions assume you have some basic knowledge of Vagrant and Ansible.

Setup of eWaterCycle platform on the SURF Research cloud

The hardware environment used by the eWaterCycle platform development team is the SURF Research Cloud. Starting a machine on the SURF Research Cloud requires a research budget with SURF; for more info see the SURF website. Once the machine is running, access to it can be shared with anyone.

The setup instructions in this repo will create an eWaterCycle application (a sort of VM template) that, when started, will create a machine with:

An application on the SURF Research cloud is provisioned by running an Ansible playbook (research-cloud-plugin.yml).

In addition to the standard VM storage, additional read-only datasets are mounted at /mnt/data from dCache using rclone. They may contain things like:

Previously the eWaterCycle platform consisted of multiple VMs on the SURF HPC cloud; see the v0.1.2 release for that code.

Setup of eWaterCycle platform on a local test VM

Deploying a local test VM is mostly useful for developing the SURF Research Cloud applications. This Vagrant setup creates a virtual machine with 8 GB memory, 4 virtual cores, and 70 GB storage. This should work on any Linux or Windows machine.

To set up an Explorer/Jupyter server on your local machine with Vagrant and Ansible:

Create config file research-cloud-plugin.vagrant.vars with

---
dcache_ro_token: <dcache macaroon with read permission>
rclone_cache_dir: /data/volume_2
# Directory where /home should point to
alt_home_location: /data/volume_3

The token can be found in the eWaterCycle password manager.
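To keep the token out of your editor history, the config file can also be generated from an environment variable. The following is a minimal sketch, not part of this repo: the function name write_vagrant_vars and the variable DCACHE_RO_TOKEN are hypothetical, and the keys are copied from the config shown above.

```shell
# Sketch: write research-cloud-plugin.vagrant.vars from an environment variable.
# write_vagrant_vars and DCACHE_RO_TOKEN are illustrative names (assumptions).
write_vagrant_vars() {
  target="${1:-research-cloud-plugin.vagrant.vars}"
  # Stop with a clear message when the token is not set.
  : "${DCACHE_RO_TOKEN:?set DCACHE_RO_TOKEN to the read-only dCache macaroon}"
  (
    # A newly created file is readable by its owner only.
    umask 077
    cat > "$target" <<EOF
---
dcache_ro_token: ${DCACHE_RO_TOKEN}
rclone_cache_dir: /data/volume_2
# Directory where /home should point to
alt_home_location: /data/volume_3
EOF
  )
}
```

Calling write_vagrant_vars without arguments writes the file into the current directory; pass a path as the first argument to write it elsewhere.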

vagrant --version
# Vagrant 2.4.1
vagrant plugin install vagrant-vbguest
# Installed the plugin 'vagrant-vbguest (0.32.0)'
vagrant up

Visit site

# Get ip of server with
vagrant ssh -c 'ifconfig eth1'

Go to http://<ip of eth1> and log in with vagrant:vagrant.

You will get some complaints about insecure serving. This is OK for local testing and will not happen on Research Cloud.
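If you want only the address rather than the full interface listing, the lookup can be scripted. This sketch swaps ifconfig for the ip command (assuming iproute2 is installed in the VM) because its one-line output is easier to parse; extract_ipv4 is a hypothetical helper name.

```shell
# Sketch: print the first IPv4 address found in `ip -4 -o addr show` output.
# extract_ipv4 is a hypothetical helper; it reads the listing on stdin, where
# the third field is "inet" and the fourth looks like 192.168.56.10/24.
extract_ipv4() {
  awk '$3 == "inet" { split($4, parts, "/"); print parts[1]; exit }'
}

# Usage (requires a running VM):
#   vagrant ssh -c 'ip -4 -o addr show eth1' | extract_ipv4
```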

Test on Windows Subsystem for Linux 2

WSL2 users should follow steps on https://www.vagrantup.com/docs/other/wsl.

Importantly:

Catalog item registration

This chapter is intended for catalog item developers.

On the Research Cloud the developer can add a catalog item for other people to use. The generic steps to do this are documented here.

For the eWaterCycle component the following specialization was done

For the eWaterCycle catalog item the following specialization was done

To become root on a VM the user needs to be a member of the src_co_admin group on SRAM. See docs.

SURF Research cloud VM deployment

This chapter is intended for application deployers.

  1. Log into Research Cloud
  2. Create new storage item for home directories
    • To store user files
    • Use a size of 50 GB for simple experiments, or more when the experiment requires it.
    • As each storage item can only be used by a single workspace, give it a name and description so you know which workspace and storage items go together.
  3. Create new storage item for cache
    • To store cached files from dCache by rclone
    • Use a size of 50 GB
    • As each storage item can only be used by a single workspace, give it a name and description so you know which workspace and storage items go together.
  4. Create a new workspace
  5. Select eWaterCycle application
  6. Select collaborative organisation (CO) for example ewatercycle-nlesc
  7. Select size of VM (cpus/memory) based on use case
  8. Select home storage item.
    • The order in which the storage items are selected is important; make sure to select the home storage item before the cache storage item.
  9. Select cache storage item
  10. Wait for machine to be running
  11. Visit URL/IP
  12. When done delete machine

For a new CO make sure

End users should be invited to the CO so they can log in.

See the User guide for what users have to do to log in or to use a GitHub repository.

Example notebooks

To get the example notebooks, end users should use the following URL (replace <workspace id> with the id of your currently running workspace)

https://<workspace id>.workspaces.live.surfresearchcloud.nl/jupyter/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2FeWaterCycle%2Fewatercycle&urlpath=lab%2Ftree%2Fewatercycle%2Fdocs%2Fexamples%2FMarrmotM01.ipynb&branch=main

TODO add this link to home page of server at

This link uses nbgitpuller to sync a git repo and open a notebook in it.
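Since only the workspace id changes between deployments, the link can be generated with a small helper. This is a sketch: example_notebook_url is a hypothetical function name, and the query string is copied verbatim from the URL above.

```shell
# Sketch: build the nbgitpuller link for a given workspace id.
# example_notebook_url is a hypothetical helper; only the host name varies.
example_notebook_url() {
  # Already percent-encoded query: repo, notebook path and branch.
  query='repo=https%3A%2F%2Fgithub.com%2FeWaterCycle%2Fewatercycle&urlpath=lab%2Ftree%2Fewatercycle%2Fdocs%2Fexamples%2FMarrmotM01.ipynb&branch=main'
  printf 'https://%s.workspaces.live.surfresearchcloud.nl/jupyter/hub/user-redirect/git-pull?%s\n' "$1" "$query"
}

# Usage:
#   example_notebook_url "<workspace id>"
```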

Fill shared data disk

This chapter is intended for application data preparers.

The eWaterCycle system setup requires a lot of data files. For the Research Cloud virtual machines we will mount a dCache bucket.

To fill the dcache bucket you can run

ansible-playbook \
  -e cds_uid=1234 -e cds_api_key=<cds api key> \
  -e dcache_rw_token=<dcache macaroon with read/write permissions> \
  shared-data-disk.yml

Running this script will download all data files to /mnt/data and upload them to dCache.

Sync dcache with existing folder elsewhere

The steps above fetch the data from original sources. If you want to sync some files from another location, say, Snellius, you can use rclone directly. In our experience, it works better to sync entire directories than to try and copy single files.

Create the file ~/.config/rclone/rclone.conf and add the following content:

[dcache]
type = webdav
url = https://webdav.grid.surfsara.nl:2880
vendor = other
user =
pass =
bearer_token = <dcache macaroon with read/write permissions>

You can verify your access by running a harmless rclone ls dcache:parameter-sets. The command to sync directories is rclone copy somedir dcache:parameter-sets/somedir. Beware that this overwrites existing files on dCache when they differ from the local copy!

Note: the password manager can be used for exchanging macaroons.
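Because a copy can overwrite remote files, it can help to preview the transfer first. The sketch below wraps the rclone copy command from this section in two hypothetical helpers, preview_copy and apply_copy; the --dry-run and --progress flags are standard rclone options, and the dcache:parameter-sets destination is the one used above.

```shell
# Sketch: preview an rclone copy to dCache before running it for real.
# preview_copy and apply_copy are hypothetical helper names.
dcache_target() {
  # Keep the directory name on the remote, e.g. somedir -> parameter-sets/somedir.
  printf 'dcache:parameter-sets/%s\n' "$(basename "$1")"
}

preview_copy() {
  # --dry-run reports what would be transferred without changing the remote.
  rclone copy --dry-run "$1" "$(dcache_target "$1")"
}

apply_copy() {
  # --progress prints transfer statistics while copying.
  rclone copy --progress "$1" "$(dcache_target "$1")"
}

# Usage:
#   preview_copy somedir   # inspect the planned transfers
#   apply_copy somedir     # perform the copy
```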

Mount dcache on local machine

Create the file ~/.config/rclone/rclone.conf and add the following content:

[dcache]
type = webdav
url = https://webdav.grid.surfsara.nl:2880
vendor = other
user =
pass =
bearer_token = <dcache macaroon with read permissions>

Install rclone and run the following commands to mount dCache at the ~/dcache directory.

mkdir ~/dcache
rclone mount --read-only --cache-dir /tmp/rclone-cache --vfs-cache-max-size 30G --vfs-cache-mode full dcache:/ ~/dcache

In ESMValTool config files you can use ~/dcache/climate-data/obs6 for rootpath:OBS6.
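The mount command above stays in the foreground. As a sketch for Linux, the mount can instead be started in the background with rclone's --daemon flag and released with fusermount -u; mount_dcache and unmount_dcache are hypothetical helper names, and the remaining options are those shown above.

```shell
# Sketch: mount dCache in the background and unmount it again (Linux, FUSE).
# mount_dcache and unmount_dcache are hypothetical helper names.
mount_dcache() {
  mount_dir="${1:-$HOME/dcache}"
  mkdir -p "$mount_dir"
  # --daemon detaches rclone so the shell stays usable.
  rclone mount --daemon --read-only \
    --cache-dir /tmp/rclone-cache \
    --vfs-cache-max-size 30G --vfs-cache-mode full \
    dcache:/ "$mount_dir"
}

unmount_dcache() {
  # fusermount -u releases a FUSE mount owned by the current user.
  fusermount -u "${1:-$HOME/dcache}"
}

# Usage:
#   mount_dcache            # mounts at ~/dcache
#   ls ~/dcache/climate-data/obs6
#   unmount_dcache
```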

Docker images

In the eWaterCycle project we make Docker images. The images are hosted on Docker Hub. A project member can create an issue here to request permission to push images to Docker Hub.

Logs

All services are running with systemd. Their logs can be viewed with journalctl. The log of the Jupyter server for each user can be followed with

journalctl -f -u jupyter-vagrant-singleuser.service

(replace vagrant with your own username)
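To avoid editing the unit name by hand, the username can be passed as an argument. This is a sketch with hypothetical helper names (jupyter_unit, follow_jupyter_log); the unit name pattern is taken from the command above.

```shell
# Sketch: follow the Jupyter server log of a given user.
# jupyter_unit and follow_jupyter_log are hypothetical helper names.
jupyter_unit() {
  printf 'jupyter-%s-singleuser.service\n' "$1"
}

follow_jupyter_log() {
  # Fall back to the current user when no name is given.
  journalctl -f -u "$(jupyter_unit "${1:-$(id -un)}")"
}

# Usage:
#   follow_jupyter_log vagrant
```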