A workshop to get started with the Data Science Research Infrastructure (DSRI) in an hour 🕐 (hopefully)!
During this workshop, you will:
oc command line interface
Prerequisites:
📖 The DSRI documentation can be found at https://maastrichtu-ids.github.io/dsri-documentation
Connect to the UM VPN.
Students can use the Athena Student Desktop at athenadesktop.maastrichtuniversity.nl to access the DSRI web UI
On Linux you can use openconnect:
sudo openconnect --passwd-on-stdin -u YOUR.UM.USER --authgroup 01-Employees vpn-rw1.maastrichtuniversity.nl
Access the DSRI OpenShift web UI
👩💻 Go to the workspace-workshop project in the OpenShift web UI
Start a JupyterLab/RStudio/VSCode application from the DSRI catalog in ids-projects
📖 See how to deploy JupyterLab, RStudio, VSCode and lots more.
👨💻 Use your name to generate a unique Application name, e.g. rstudio-vemonet
Persistent storage will be created automatically.
Access the application you just started
👨💻 For small and medium-sized files, you can simply drag and drop files and folders into the application web UI, or use the Upload files button in RStudio.
This solution works for files up to a few hundred MB (depending on the application, use it until it fails!).
We recommend using git with GitHub or GitLab. You can use it directly from the terminal in all applications, or through the web UI integration each application offers.
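As a minimal sketch of the terminal workflow, the commands below create a local repository, record a first commit, and register a remote. The repository URL, directory name, and author details are hypothetical placeholders, not values from this workshop; replace them with your own.

```shell
# Hypothetical placeholders: replace with your own repository and identity.
REPO_URL="https://github.com/YOUR-USER/YOUR-REPO.git"
PROJECT_DIR="my-analysis"

# Create a working directory and turn it into a git repository.
mkdir -p "$PROJECT_DIR"
cd "$PROJECT_DIR"
git init --quiet

# Register the remote repository (no network access happens at this point).
git remote add origin "$REPO_URL"

# Add a file and record it in a first commit.
echo "# My analysis" > README.md
git add README.md
git -c user.name="Your Name" -c user.email="you@example.org" \
  commit --quiet -m "Add README"

# Uncomment once REPO_URL points to a repository you can write to:
# git push -u origin HEAD
```

If the repository already exists on GitHub or GitLab, running git clone with its URL replaces the init and remote steps.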
📖 See the documentation for each application:
JupyterLab (jupyterlab-git extension installed): https://maastrichtu-ids.github.io/dsri-documentation/docs/deploy-jupyter#use-git-in-jupyterlab
For large data files you will need to install the oc command line interface.
If you have the time, it can be installed quickly on macOS and Linux (this also works with WSL):
- On Linux 🐧
wget https://github.com/openshift/origin/releases/download/v3.11.0/openshift-origin-client-tools-v3.11.0-0cbc58b-linux-64bit.tar.gz
tar xvf openshift-origin-client-tools*.tar.gz
cd openshift-origin-client*/
sudo mv oc kubectl /usr/local/bin/
- On Mac 🍎
brew install openshift-cli
- On Windows 🏢
📖 See the complete documentation to upload large data files
💡 You will have a better connection for uploading large data files when directly connected to the UMnet network (or eduroam at UM). A wired Ethernet connection is even better.
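Once oc is installed and you are logged in, an upload could look roughly like the sketch below. The pod name, local folder, and remote path are hypothetical placeholders, and the app label on pods is an assumption based on the selector used elsewhere in this workshop. The run helper only prints each command by default so you can review it; set DRY_RUN=0 to actually execute.

```shell
# Hypothetical placeholders: adapt them to your own application.
PROJECT="workspace-workshop"
APP_NAME="rstudio-vemonet"
POD_NAME="rstudio-vemonet-1-abcde"   # copy the real name from `oc get pods`
LOCAL_DIR="./data/"
REMOTE_DIR="/tmp/data"               # assumed to exist already in the pod

# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Select the workshop project.
run oc project "$PROJECT"

# List the pods of your application (assumes they carry the label app=<name>).
run oc get pods --selector "app=$APP_NAME"

# Copy the local folder into the running pod.
run oc rsync "$LOCAL_DIR" "$POD_NAME:$REMOTE_DIR"
```

For a single file, oc cp with the same pod:path destination form is an alternative to oc rsync.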
👨💻 Stop your application from the OpenShift web UI Topology page:
You can use the Filter by name search box to quickly find your application based on the name you gave it.
Note: creating more than one pod ("Scale up") is useless for most data science applications, such as RStudio, VSCode or JupyterLab. It is only relevant for applications running as a cluster, like Apache Flink or Apache Spark, or for web applications with a lot of traffic (OpenShift will redirect the traffic depending on pod availability, and start new pods if required, a.k.a. horizontal scaling).
👩💻 Delete your application:
If you have installed the oc command line interface, it is easier to use it to delete all the objects related to your application:
oc delete all,secret,configmaps,serviceaccount,rolebinding --selector app=my-application
Replace my-application with the Application name you defined.
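Because deletion cannot be undone, it can help to list what the selector matches before removing anything. The sketch below assumes a hypothetical application name, rstudio-vemonet, and uses the same resource types and selector as the delete command. The run helper prints the commands by default; set DRY_RUN=0 to execute them.

```shell
# Hypothetical application name: use the one you chose at deployment.
APP_NAME="rstudio-vemonet"
RESOURCES="all,secret,configmaps,serviceaccount,rolebinding"

# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: list every object carrying the application label.
run oc get "$RESOURCES" --selector "app=$APP_NAME"

# Step 2: delete those same objects once the list looks right.
run oc delete "$RESOURCES" --selector "app=$APP_NAME"
```

Keeping the resource list and selector in variables ensures the preview and the deletion target exactly the same objects.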
📖 See the complete documentation to delete an application.
📝 Fill this form to help us create a project for you on the Data Science Research Infrastructure for a longer term!