UCSF-CBI / c4-ondemand-interactive-apps

Add field to load additional environment modules #18

Closed HenrikBengtsson closed 2 years ago

HenrikBengtsson commented 2 years ago

Issue

For example, for R, we sometimes need to load additional modules in order to use some R packages, e.g. R package hdf5r requires module CBI hdf5. However, when using R via RStudio OnDemand, there is no option to load environment modules.

Suggestion

Add a free-text field to the forms of all OnDemand interactive tools (i.e., RStudio and Jupyter for now):

Environment modules to be loaded on startup (optional): [ ... text input field ... ]

Here the user should be able to enter, for instance, CBI hdf5 gdal. The OnDemand "before" scripts should then load these modules, if specified.

HenrikBengtsson commented 2 years ago

We could also extend this to be a text box that accepts any shell code, e.g.

module load CBI
module load hdf5 gdal

The only thing that speaks against this is that there is a slight security risk, since this can execute anything on the user's behalf.
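
One way to keep the plain module-list field from carrying arbitrary shell code is to validate it before it reaches `module load`. The following is only a sketch under assumptions: the function name `validate_modules` and the set of allowed characters are hypothetical and not part of the actual OnDemand app.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the actual app): accept only tokens that
# look like module names, e.g. CBI or hdf5/1.12.1, and reject anything
# containing shell metacharacters. The allowed character set is an assumption.
validate_modules() {
  local input="$1" token
  local re='^[A-Za-z0-9][A-Za-z0-9._+/-]*$'
  local -a tokens=()

  # Split on any whitespace (including newlines) without glob expansion.
  # read returns non-zero at end of input, hence the "|| true".
  read -r -d '' -a tokens <<< "$input" || true

  for token in "${tokens[@]}"; do
    if [[ ! "$token" =~ $re ]]; then
      echo "Rejected module name: $token" >&2
      return 1
    fi
  done
  return 0
}
```

An empty field passes validation, which matches the field being optional. A caller could run `validate_modules "$field" || exit 1` before loading anything.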

HenrikBengtsson commented 2 years ago

@hgputnam, I think this should be implemented soon. It's a blocker for certain types of packages in R. It should be straightforward to implement.

hgputnam commented 2 years ago

Will do my best to get this started this week. Lots of hardware work going on: two new storage servers, one sick compute node, one sick storage server...

hgputnam commented 2 years ago

OK, I got myself set up for testing this, etc. @HenrikBengtsson - should I assume that all modules are coming from the CBI repo or should I allow for different top level repos to allow Labs to use their own?

hgputnam commented 2 years ago

I could make a repo field as a pick list since there are only a few and they don't change often...

HenrikBengtsson commented 2 years ago

Anything on MODULEPATH, i.e. they have to load CBI, DoeLab, ... if they want to use their modules, like it works everywhere else.

HenrikBengtsson commented 2 years ago

> I could make a repo field as a pick list since there are only a few and they don't change often...

Too complicated to maintain and keep up to date, and too much info to represent as a GUI widget. We have ~200 modules, each with different versions. A plain text field is the most flexible and generic.

hgputnam commented 2 years ago

@HenrikBengtsson check it out on https://c4-ondemand2

If it looks good I will deploy to c4-ondemand1. Personally, I think the help verbiage needs some work; I'm not sure how to phrase the need to specify the top-level repo.

HenrikBengtsson commented 2 years ago

I think it's sufficient to have:

Enter additional modules to be loaded separated by space, e.g. CBI gdal hdf5/1.12.1

Are you loading these modules before or after the R module? It might be safer to do it before, so that the selected R version always trumps whatever they specify as additional modules. That way, the r module can also make sure to have its dependencies loaded.

hgputnam commented 2 years ago

Current order:

module load <%= context.additional_modules %>
module load CBI
module load r/<%= context.r_version %>
module load rstudio-server-controller

HenrikBengtsson commented 2 years ago

> Current order:

Perfect

HenrikBengtsson commented 2 years ago

FWIW, it looks like module load also works when no modules are specified, i.e. there's no error:

$ module load
$ echo $?
0

(This is with module 8.6 on my laptop)

hgputnam commented 2 years ago

I've been checking output.log and script.sh after each run. I threw in a module list so we can easily see what is happening.

With additional modules:

script.sh:

#!/usr/bin/env bash

#shellcheck source=before.sh.erb

module load CBI hdf5 gdal
module load CBI
module load r/4.2.0
module load rstudio-server-controller

# troubleshooting
module list    

# Start RSC launching the personal Rstudio Server instance
rsc start --port="${port:?}"

output.log:

Script starting...
Waiting for RStudio Server to open port 13183...

Currently Loaded Modules:
  1) CBI                 5) r/4.2.0
  2) hdf5/1.12.1         6) rstudio-server/2022.02.1-461
  3) gdal/2.4.4          7) expect/5.45.4
  4) scl-devtoolset/10   8) rstudio-server-controller/0.8.4

WARNING: Detected a stray PID file (/c4/home/hputnam/.config/rsc/rserver.pid). This file was removed, because it referred to a rserver process (PID 6546 on c4-n12) which is no longer running.
WARNING: Detected a stray PID file (/c4/home/hputnam/.config/rsc/rserver_monitor.pid). This file was removed, because it referred to a rserver_monitor process (PID 6593 on c4-n12) which is no longer running.
WARNING: Detected a stray lock file (/c4/home/hputnam/.config/rsc/pid.lock). This file was removed, because there was no evidence that another instance was running on this system (no PID files for neither 'rserver' nor 'rsession' were found).
Discovered RStudio Server listening on port 13183!
Generating connection YAML file...
hputnam, your personal RStudio Server is available on <http://c4-n12:13183>. If you are running from a remote machine without direct access to c4-n12, you can use SSH port forwarding to access the RStudio Server at <http://127.0.0.1:8787> by running 'ssh -L 8787:c4-n12:13183 hputnam@<login-machine>' in a second terminal.
Any R session started times out after being idle for 120 minutes.
WARNING: You now have 10 minutes, until 2022-06-07 15:06:12-07:00, to connect and log in to the RStudio Server before everything times out.

Without additional modules:

output.log:

Script starting...
Waiting for RStudio Server to open port 13147...

Currently Loaded Modules:
  1) CBI                 4) rstudio-server/2022.02.1-461
  2) scl-devtoolset/10   5) expect/5.45.4
  3) r/4.2.0             6) rstudio-server-controller/0.8.4

WARNING: Detected a stray PID file (/c4/home/hputnam/.config/rsc/rserver.pid). This file was removed, because it referred to a rserver process (PID 7154 on c4-n12) which is no longer running.
WARNING: Detected a stray PID file (/c4/home/hputnam/.config/rsc/rserver_monitor.pid). This file was removed, because it referred to a rserver_monitor process (PID 7204 on c4-n12) which is no longer running.
WARNING: Detected a stray lock file (/c4/home/hputnam/.config/rsc/pid.lock). This file was removed, because there was no evidence that another instance was running on this system (no PID files for neither 'rserver' nor 'rsession' were found).
Discovered RStudio Server listening on port 13147!
Generating connection YAML file...

So, no errors are thrown in either case. There is not much we can do about a user entering a bad module name. Here I put in X for the additional modules:

output.log:

Script starting...
Waiting for RStudio Server to open port 63436...
Lmod has detected the following error: The following module(s) are unknown:
"X"

Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
  $ module --ignore-cache load "X"

Also make sure that all modulefiles written in TCL start with the string
#%Module

Currently Loaded Modules:
  1) CBI                 4) rstudio-server/2022.02.1-461
  2) scl-devtoolset/10   5) expect/5.45.4
  3) r/4.2.0             6) rstudio-server-controller/0.8.4

WARNING: Detected a stray PID file (/c4/home/hputnam/.config/rsc/rserver.pid). This file was removed, because it referred to a rserver process (PID 7909 on c4-n12) which is no longer running.
WARNING: Detected a stray PID file (/c4/home/hputnam/.config/rsc/rserver_monitor.pid). This file was removed, because it referred to a rserver_monitor process (PID 7958 on c4-n12) which is no longer running.
WARNING: Detected a stray lock file (/c4/home/hputnam/.config/rsc/pid.lock). This file was removed, because there was no evidence that another instance was running on this system (no PID files for neither 'rserver' nor 'rsession' were found).
Discovered RStudio Server listening on port 63436!
Generating connection YAML file...

So an error is thrown (invisible to the user) but RStudio still works albeit without the X module...
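
If the silent failure turns out to be a problem, one option would be to load the additional modules one at a time and check each exit status. The sketch below is not what is deployed: the function name `load_modules_or_report` is hypothetical, and it assumes that `module load` returns a non-zero exit status for an unknown module, which should be verified against the installed Lmod version.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the deployed script): load modules in the
# given order and report the ones that failed.
# Assumption: `module load` exits non-zero when a module cannot be loaded.
load_modules_or_report() {
  local name
  local -a failed=()

  for name in "$@"; do
    # Load one module at a time so that order is preserved (e.g. CBI first)
    # and each failure can be attributed to a specific name.
    if ! module load "$name"; then
      failed+=("$name")
    fi
  done

  if (( ${#failed[@]} > 0 )); then
    echo "ERROR: could not load module(s): ${failed[*]}" >&2
    return 1
  fi
  return 0
}
```

In the script this could be called as `load_modules_or_report CBI hdf5 gdal`, with the caller deciding whether a non-zero return should abort the session or only add a prominent line to output.log.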

hgputnam commented 2 years ago

It is fully deployed. Should I send a Slack update about this?

HenrikBengtsson commented 2 years ago

Clever to add module list.

Sure, mentioning it on Slack would be useful. I think this feature was triggered by a question there, or maybe it was the issue tracker.