UCSF-CBI / c4-ondemand-interactive-apps

RStudio Server: Add option to select version of R #11

Closed HenrikBengtsson closed 2 years ago

HenrikBengtsson commented 2 years ago

A few thoughts:

The possible values for this field should reflect the available versions in the CBI stack, cf. module load CBI; module avail r. We need to figure out the best way to update these settings so that new R versions are picked up automatically when they become available. At a minimum, this should be possible via a cron job.
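A rough sketch of how such a cron job could generate the option list. It assumes that `module -t avail r` (Lmod's terse listing, which Lmod writes to standard error) prints one `r/<version>` entry per line, and that each form.yml option is written as a two-element list. The function name `r_versions_to_options` is made up for this sketch and is not part of the repository.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the repository): read a terse module
# listing on stdin and print one form.yml option line per R version.
r_versions_to_options() {
    # Keep only entries such as "r/4.1.2", drop the "r/" prefix, and
    # sort newest first (-V is the version sort of GNU sort).
    grep -E '^r/[0-9]+(\.[0-9]+)*$' \
        | sed 's|^r/||' \
        | sort -u -r -V \
        | while IFS= read -r version; do
              printf '      - ["%s", "%s"]\n' "${version}" "${version}"
          done
}
```

On the cluster this might be run as `module load CBI; module -t avail r 2>&1 | r_versions_to_options`, with the output pasted (or templated) into RStudio/form.yml.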

The script.sh.erb should be updated to something like:

module load CBI
module load r/"${ONDEMAND_R_VERSION}"
module load rstudio-server-controller
hgputnam commented 2 years ago

Yeah that is basically what I was thinking. It works around licensing issues... I need to find some documentation that tells me which R versions can be used with which RStudio Server versions.

HenrikBengtsson commented 2 years ago

I need to find some documentation that tells me which R versions can be used with which RStudio Server versions.

Didn't think about that constraint. I guess the easiest is just to test until it breaks, e.g.

$ module load r/4.0.0
$ rsc start
$ module load r/3.6.0
$ rsc start

and so on.

UPDATE: Forgot to make it clear that one needs to log in so that an R session is actually started. That's the minimal test. For full compatibility, one would have to use it in different ways. That's lots of work. I'd leave that to end users to test, i.e. if there's something that doesn't work, we'll hear about it.
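The "test until it breaks" loop could be scripted roughly as below. This is only a sketch: `check_versions` and `r_starts` are hypothetical names, and `r_starts` merely checks that R itself starts after the modules are loaded. It does not replace logging in through the browser to confirm that an R session comes up in RStudio.

```shell
#!/usr/bin/env bash

# check_versions CHECK_CMD VERSION...
# Runs CHECK_CMD once per version and prints one OK/FAIL line for each.
check_versions() {
    local check_cmd="$1"
    shift
    local version
    for version in "$@"; do
        if "${check_cmd}" "${version}"; then
            echo "OK   r/${version}"
        else
            echo "FAIL r/${version}"
        fi
    done
}

# Assumed smoke test: can this R version start at all?  Runs in a
# subshell so that the loaded modules do not leak into the caller.
r_starts() {
    (
        module load CBI "r/$1" rstudio-server-controller \
            && Rscript -e 'invisible(TRUE)'
    ) > /dev/null 2>&1
}
```

For example, `check_versions r_starts 4.0.0 3.6.0` would print one line per version.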

HenrikBengtsson commented 2 years ago

For the record, RStudio Server Pro (now called RStudio Workbench) supports selecting multiple versions from inside RStudio (https://support.rstudio.com/hc/en-us/articles/212364537-Using-multiple-versions-of-R-with-RStudio-Workbench-RStudio-Server-Pro), but we don't have that option.

hgputnam commented 2 years ago

but we don't have that option.

Yeah, that agrees with what I found. I was thinking the way forward is to package specific RStudio Server & R combinations. Those would be separate apps with distinct labels in the OnDemand web interface. I think we would just hard-code the R version with:

  --rsession-which-r arg                The path to the main R program (e.g. 
                                        /usr/bin/R). This should be set if no 
                                        versions are specified in 
                                        /etc/rstudio/r-versions and the default
                                        R installation is not available on the 
                                        system path.

And possibly this would be needed, though maybe not if we module load R/my-desired-version; not sure about that:

  --rsession-ld-library-path arg        Specifies additional LD_LIBRARY_PATHs 
                                        to use for R sessions.
hgputnam commented 2 years ago

I guess this would require multiple rsc versions as well?

HenrikBengtsson commented 2 years ago

RStudio Server will pick up whatever R is on the search PATH, so all we need to do is module load r/${ONDEMAND_R_VERSION} before calling rsc.

(My comment above was just to share that I looked into that alternative, which I knew existed but wasn't sure we could use with the open-source version.)

Advanced: FWIW, one could imagine having a custom R function, say, rsc_restart("3.6.0") that will update the config file (which the user owns) to:

  --rsession-which-r <path-to-R-3.6.0>

and then call rstudioapi::restartSession(). I think that would work, but it's very much specific to rsc, so probably better not to do.

HenrikBengtsson commented 2 years ago

I guess this would require multiple rsc versions as well?

No. If we had gone down the path of editing the config file (--rsession-which-r ...), we could have added options for rsc to do that. rsc doesn't care about the underlying version of R; it just passes on command -v R.
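To illustrate why the order on PATH is all that matters, here is a self-contained toy example. It does not touch any real R installation: the two "R" executables are fake scripts in a scratch directory, and `which_r_with` is a name invented for this sketch.

```shell
#!/usr/bin/env bash

# Build two fake "R installations" in a scratch directory.
demo_root="$(mktemp -d)"
for version in 3.6.0 4.0.0; do
    mkdir -p "${demo_root}/r-${version}/bin"
    printf '#!/usr/bin/env bash\necho "R version %s"\n' "${version}" \
        > "${demo_root}/r-${version}/bin/R"
    chmod +x "${demo_root}/r-${version}/bin/R"
done

# which_r_with DIR: report which R 'command -v' finds when DIR comes
# first on PATH.  The subshell keeps the PATH change local.
which_r_with() {
    (
        PATH="$1:${PATH}"
        command -v R
    )
}

which_r_with "${demo_root}/r-3.6.0/bin"
which_r_with "${demo_root}/r-4.0.0/bin"
```

Each call prints the fake R that sits first on PATH, which is the same effect that loading a particular r module has before rsc is started.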

hgputnam commented 2 years ago

No. If we had gone down the path of editing the config file (--rsession-which-r ...), we could have added options for rsc to do that. rsc doesn't care about the underlying version of R; it just passes on command -v R.

When we load rsc today, it loads both the latest rstudio-server and r modules from CBI, correct? So will we have version-specific rsc modules in CBI that load particular versions of r and rstudio-server? Just trying to figure out an OnDemand piece that doesn't violate our principle of making things work the same way from srun, OnDemand, and the Linux CLI on a dev node.

hgputnam commented 2 years ago

@HenrikBengtsson I am struggling to find a document that shows which versions of R can be run from within which versions of RStudio Server. Don't see it here: https://docs.rstudio.com/ide/server-pro/r_versions/managing_upgrades.html

Can we have any version of R? Doesn't seem quite right...

HenrikBengtsson commented 2 years ago

@HenrikBengtsson I am struggling to find a document that shows which versions of R can be run from within which versions of RStudio Server. Don't see it here: https://docs.rstudio.com/ide/server-pro/r_versions/managing_upgrades.html

Can we have any version of R? Doesn't seem quite right...

I tried to find any info too. I suspect we can. Let's assume it'll just work. The only thing I've heard of is that when R 4.1.0 came out with a new pipe syntax (a |> b), the RStudio command-line interpreter didn't know about that (obviously), so they had to upgrade RStudio. But I suspect that the most recent RStudio version is backward compatible quite far back with R. I don't see why it shouldn't be. Note that RStudio is just a human-to-machine GUI on top of the R runtime.

hgputnam commented 2 years ago

OK, so the idea for rsc and rstudio-server will be to have different modules each of which loads a different r module?

HenrikBengtsson commented 2 years ago

OK, so the idea for rsc and rstudio-server will be to have different modules each of which loads a different r module?

All we need to do is (see top comment):

module load CBI
module load r/"${ONDEMAND_R_VERSION}"
module load rstudio-server-controller

So, for OnDemand, you just need to figure out how to set environment variable ONDEMAND_R_VERSION based on the form and everything should work.
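A small sketch of how the "default" choice could be mapped to a module name, so that an unset or "default" value falls back to plain `r` (Lmod's default version). The helper name `r_module_spec` is hypothetical and not something that exists in the repository.

```shell
#!/usr/bin/env bash

# r_module_spec VERSION
# Prints the module to load: "r" for an empty or "default" VERSION,
# otherwise "r/VERSION".
r_module_spec() {
    local version="${1:-default}"
    if [[ ${version} == "default" ]]; then
        echo "r"
    else
        echo "r/${version}"
    fi
}
```

The job script could then use `module load "$(r_module_spec "${ONDEMAND_R_VERSION:-}")"` between the other two `module load` lines.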

hgputnam commented 2 years ago

OK, I see. You did something fancy so that loading, say, r/4.0.3 and then loading rstudio-server-controller means we have:

$ module list

Currently Loaded Modules:
  1) CBI   2) r/4.0.3   3) rstudio-server/2021.09.2-382   4) expect/5.45.4   5) rstudio-server-controller/0.6.0

Whereas just loading rstudio-server-controller would give us r/4.1.2. I thought the 4.1.2 would replace the 4.0.3.

Ok way easier then. I'm trying to catch up on clean-up items, then I will see what I can do...

HenrikBengtsson commented 2 years ago

If you look at module show rstudio-server-controller, you'll find:

----------------------------------------------------------------------------------------------------------------
   /software/c4/cbi/modulefiles/rstudio-server-controller/0.6.0.lua:
----------------------------------------------------------------------------------------------------------------
help([[RSC: An RStudio Server Controller
]])
whatis("Version: 0.6.0")
whatis("Keywords: programming, R, RStudio Server, GUI")
whatis("URL: https://github.com/UCSF-CBI/rstudio-server-controller, https://github.com/UCSF-CBI/rstudio-server-controller/blob/main/NEWS.md (changelog)")
whatis("Description: The RStudio Server Controller (RSC) is a tool for launching a personal instance of the RStudio Server on a Linux machine, which then can be access via the web browser, either directly or via SSH tunneling.
Examples: `rsc --help`, and `rsc start`.
Warning: This is work under construction!
")
depends_on("r")
depends_on("rstudio-server")
try_load("expect")
prepend_path("PATH","/software/c4/cbi/software/rstudio-server-controller-0.6.0/bin")

That

depends_on("r")

means: If the r module is already loaded, do nothing, otherwise load the default version.
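To see which r module ended up loaded after the `depends_on("r")` step, one can inspect the colon-separated `LOADEDMODULES` environment variable that the module system maintains. The following is a sketch only; the function name `loaded_r_version` is made up for illustration.

```shell
#!/usr/bin/env bash

# loaded_r_version: print the version of the currently loaded r module
# (e.g. "4.0.3"), or return non-zero if no r/<version> module is loaded.
loaded_r_version() {
    local -a modules
    local module
    # LOADEDMODULES looks like "CBI:r/4.0.3:rstudio-server/2021.09.2-382".
    IFS=':' read -r -a modules <<< "${LOADEDMODULES:-}"
    for module in "${modules[@]}"; do
        if [[ ${module} == r/* ]]; then
            echo "${module#r/}"
            return 0
        fi
    done
    return 1
}
```

Calling it right after `module load rstudio-server-controller` shows whether a previously loaded r version was kept or the default was pulled in.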

HenrikBengtsson commented 2 years ago

I've prepared RStudio/template/script.sh.erb for this:

https://github.com/UCSF-CBI/c4-ondemand-interactive-apps/blob/c7204126b1584d87bf5508889d8246228d71ec68/RStudio/template/script.sh.erb#L5-L22

HenrikBengtsson commented 2 years ago

Without knowing much of OnDemand and form.yml, this is my proposal to update RStudio/form.yml:

---
cluster: "c4"
attributes:
  r_version:
    widget: "select"
    options:
      - ["default"]
      - ["4.2.0"]
      - ["4.1.3"]
      - ["4.1.2"]
      - ["4.1.1"]
      - ["4.1.0"]
      - ["4.0.5"]
      - ["4.0.4"]
      - ["4.0.3"]
      - ["4.0.2"]
      - ["4.0.0"]
      - ["3.6.3"]
      - ["3.6.0"]
      - ["3.5.3"]
      - ["3.5.0"]
      - ["3.3.0"]
      - ["3.2.0"]
      - ["3.1.0"]
      - ["3.0.0"]
      - ["2.15.0"]
  bc_num_hours:
    label: "Number of hours"
    min: 1
    max: 24
    step: 1
    help: "Enter a value between 1 and 24 (hours)"
  bc_queue:
    label: "Partition"
  slurm_mem:
    widget: "number_field"
    max: 256
    min: 1
    step: 1
    value: 1
    label: "Memory (GiB)"
    help: "Enter a value between 1 and 256 (GiB)"
  slurm_scratch:
    widget: "number_field"
    max: 256
    min: 1
    step: 1
    value: 1
    label: "Scratch space (GiB)"
    help: "Enter a value between 1 and 2550 (1 GiB - 2.5 TiB)"   
  slurm_cores:
    widget: "number_field"
    max: 16
    min: 1
    step: 1
    value: 1
    label: "Number of CPU cores"
    help: "Enter a value between 1 and 16 (cores)"
form:
  - bc_queue
  - bc_num_hours
  - r_version
  - slurm_mem
  - slurm_cores
  - slurm_scratch

@hgputnam does this look alright to you? The objective is to get the environment variable r_version set for RStudio/template/script.sh.erb.

hgputnam commented 2 years ago

Agreed and what I had in mind (pick list). I will do this as time allows...

hgputnam commented 2 years ago

@HenrikBengtsson were you making changes? No problem if you are, I just don't want to step on whatever you are doing. I see the select widget in form.yml for r_version but it doesn't look like the above comment. If you are not making changes, let me know and I will try to get this done. FYI - am using c4-ondemand2 to test this sort of thing. Eventually that will be available to users but I have not yet announced its existence.

HenrikBengtsson commented 2 years ago

@HenrikBengtsson were you making changes? No problem if you are, I just don't want to step on whatever you are doing. I see the select widget in form.yml for r_version but it doesn't look like the above comment. ...

Whoops. I meant to update only RStudio/template/script.sh.erb, but it seems that I also committed my local prototype of form.yml. I'm not doing anything more. Please take over from here...

hgputnam commented 2 years ago

OK, gave it my first attempt. Something is not quite right: I get the pick list on the RStudio form, but no matter which version I pick, R 4.1.x gets loaded (checked with R.version from RStudio). The generated script looks like this:

#!/usr/bin/env bash

#shellcheck source=before.sh.erb

## The version of R to run according to OnDemand form.
## Environment variable 'ONDEMAND_R_VERSION' is optional.
## If set, it should specify one of the available R version
## in the CBI stack.  If not set, or "default", then it will
## use Lmod's default version of R.
##
## To find available R version for form.yml, use:
##
##   make r_versions
##
## and paste output into RStudio/form.yml.
if [[ ${r_version} == "default" ]]; then
    r_version=""
fi

module load CBI
module load r/"${r_version}"
module load rstudio-server-controller

# Start RSC launching the personal Rstudio Server instance
rsc start --port="${port:?}"

I can't really see what the problem is but my guess is, the r_version variable is not being passed somehow.

Do we want to set a specific version for default?

hgputnam commented 2 years ago

Digging into it...

hgputnam commented 2 years ago

Confirmed, r_version is not being passed. Time to RTFM and actually understand how the scoping works here...

$ cat output.log 
Script starting...
Waiting for RStudio Server to open port 26792...
Rversion =     <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
WARNING: Detected a stray PID file (/c4/home/hputnam/.config/rsc/rserver.pid). This file was removed, because it referred to a rserver process (PID 22806 on c4-n5) which is no longer running.
WARNING: Detected a stray PID file (/c4/home/hputnam/.config/rsc/rserver_monitor.pid). This file was removed, because it referred to a rserver_monitor process (PID 22846 on c4-n5) which is no longer running.
WARNING: Detected a stray lock file (/c4/home/hputnam/.config/rsc/pid.lock). This file was removed, because there was no evidence that another instance was running on this system (no PID files for neither 'rserver' nor 'rsession' were found).
Discovered RStudio Server listening on port 26792!
Generating connection YAML file...
hputnam, your personal RStudio Server is available on <http://c4-n12:26792>. If you are running from a remote machine without direct access to c4-n12, you can use SSH port forwarding to access the RStudio Server at <http://127.0.0.1:8787> by running 'ssh -L 8787:c4-n12:26792 hputnam@<login-machine>' in a second terminal.
Any R session started times out after being idle for 120 minutes.
WARNING: You now have 10 minutes, until 2022-05-03 14:08:07-07:00, to connect and log in to the RStudio Server before everything times out.
WARNING: The RStudio Server (PID 13009) will time out in 10 minutes on 2022-05-03 14:09:25-07:00, unless a new R session is started
HenrikBengtsson commented 2 years ago

I was worried about that; the r_version in the form.yml file is probably not an environment variable, just something used internally by OnDemand. We probably have to find a way to have OnDemand export it as an environment variable.

Add

echo "r_version='${r_version}'"
if [[ ${r_version} == "default" ]]; then
    r_version=""
fi

to simplify troubleshooting. You could even do:

echo "r_version='${r_version:?}'"

which triggers an error if r_version is unset or empty, i.e. sparing the effort of launching rsc.
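A minimal demonstration of the `:?` behavior. The wrapper function `show_r_version` is just for illustration; it runs the expansion in a subshell so that the failure does not terminate the calling script.

```shell
#!/usr/bin/env bash

# show_r_version: print r_version, failing if it is unset or empty.
# The subshell confines the exit that ':?' triggers in a script.
show_r_version() {
    ( echo "r_version='${r_version:?is not set}'" )
}

r_version="4.0.2"
show_r_version                 # prints the value, exit status 0

r_version=""
show_r_version || echo "empty r_version was rejected"
```

Note that `${r_version:?}` (with the colon) rejects both an unset and an empty variable, whereas `${r_version?}` would only reject an unset one.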

HenrikBengtsson commented 2 years ago

Ah, could it be that r_version is passed to RStudio/submit.yml.erb, but is lost there, because it is not passed on by the sbatch ..., so it never reaches any of the RStudio/template/*.sh.erb scripts?

HenrikBengtsson commented 2 years ago

I wonder if one could pass it as an argument to the job script (that I think is dynamically generated), i.e.

  native:
    - "-N 1"
    - "--ntasks"
    - "<%= slurm_cores.blank? ? 1 : slurm_cores.to_i %>"
    - "--mem"
    - "<%= slurm_mem.blank? ? 1 : slurm_mem.to_i %>gb"
    - "--gres"
    - "<%= slurm_scratch.blank? ? 1 : "scratch:" + slurm_scratch.to_s %>g"
    - "<%= r_version.to_s %>"

and then update RStudio/template/script.sh.erb to use:

## R version is passed as the first argument (optional)
r_version="${1:-default}"

if [[ ${r_version} == "default" ]]; then
    r_version=""
fi

...
hgputnam commented 2 years ago

I think it's a lack of Ruby knowledge on my part. Looking at submit.yml.erb (the one place I know form.yml variables are passed), I see unfamiliar variable syntax. For instance, slurm_mem is defined in form.yml like this:

  slurm_mem:
    widget: "number_field"
    max: 256
    min: 1
    step: 1
    value: 1
    label: "Memory (GiB)"
    help: "Enter a value between 1 and 256 (GiB)"

It gets passed to submit.yml.erb like this:

"<%= slurm_mem.blank? ? 1 : slurm_mem.to_i %>gb"

I hard coded the "gb". I believe <%= %> is standard embedded Ruby syntax, and I think the 1 is a default. There are clearly methods associated with the numeric widget type; I'm guessing that is what .to_i is?

I am experimenting with trying to reference like the above. I don't know where to go for looking up methods associated with widget types...

hgputnam commented 2 years ago

I wonder if one could pass it as an argument to the job script (that I think is dynamically generated), i.e.

Yeah, that is not clear to me. I don't think we want it in the submit.yml.erb, AFAIK that is strictly for issuing the sbatch command. I think all that stuff just gets glued in after sbatch and before the generated script.sh.

hgputnam commented 2 years ago

I will probably just post a lame question on their discourse...

hgputnam commented 2 years ago

First thing on the user form docs page:

The configuration file form.yml is responsible for:

defining all the attributes used throughout the various ERB files (defined as the session context) that make up the Interactive App . . .

I take that to mean those variables should be available to script.sh.erb.

HenrikBengtsson commented 2 years ago

(resubmit; somehow I've managed to edit your comments instead of submitting a new comment; second time today)

... I take that to mean those variables should be available to script.sh.erb.

Oh, I think I get it. We can inject it as part of the Ruby preprocessing step that generates the final script, i.e. try with:

#!/usr/bin/env bash

#shellcheck source=before.sh.erb

## The version of R entered in the OnDemand form.
## To find available R version for form.yml, use 'make r_versions'
## and paste output into RStudio/form.yml.
ONDEMAND_R_VERSION="<%= r_version.to_s %>"

## Environment variable 'ONDEMAND_R_VERSION' is optional.
## If set, it should specify one of the available R version
## in the CBI stack.  If not set, or "default", then it will
## use Lmod's default version of R.
if [[ ${ONDEMAND_R_VERSION} == "default" ]]; then
    ONDEMAND_R_VERSION=""
fi

module load CBI
module load r/"${ONDEMAND_R_VERSION}"
module load rstudio-server-controller

# Start RSC launching the personal Rstudio Server instance
rsc start --port="${port:?}"
hgputnam commented 2 years ago

I posted a really lame question on discourse: https://discourse.openondemand.org/t/help-passing-form-yml-variable-to-script-sh-erb-for-interactive-application/2038

I did find that our select widget was misconfigured based on examples on Discourse. I fixed that. The big problem now is getting the form variable passed into either script.sh.erb or before.sh.erb. I tried this:

script.sh.erb

#!/usr/bin/env bash

#shellcheck source=before.sh.erb

## The version of R to run according to OnDemand form.
## Environment variable 'ONDEMAND_R_VERSION' is optional.
## If set, it should specify one of the available R version
## in the CBI stack.  If not set, or "default", then it will
## use Lmod's default version of R.
##
## To find available R version for form.yml, use:
##
##   make r_versions
##
## and paste output into RStudio/form.yml.

r_version="<%= r_version %>"

if [[ ${r_version} == "default" ]]; then
    r_version=""
fi

module load CBI
module load r/"${r_version}"
module load rstudio-server-controller

# Start RSC launching the personal Rstudio Server instance
rsc start --port="${port:?}"

One of the things I am really confused about is where to get information about these methods. For instance, with this: ONDEMAND_R_VERSION="<%= r_version.to_s %>" How did you know about .to_s?

hgputnam commented 2 years ago

BTW, the above results in this error being passed to the web page when I hit submit:

Failed to stage the template with the following error:

undefined local variable or method `r_version' for #<BatchConnect::Session::TemplateBinding:0x00007f248535da80>

    The RStudio Server session data for this session can be accessed under the [staged root directory](https://c4-ondemand2.ucsf.edu/pun/sys/files/fs/c4/home/hputnam/ondemand/data/sys/dashboard/batch_connect/sys/RStudio/output/34c2032f-22dd-4730-baed-2e79f753f571).
HenrikBengtsson commented 2 years ago

I did find that our select widget was misconfigured based on examples on Discourse. I fixed that.

Are you referring to the options needing two elements and not one? I was probably tired, because I'm pretty sure I saw an example online that suggested that if one uses a single element, then the second value defaults to the first. Oh well ...

For instance, with this: ONDEMAND_R_VERSION="<%= r_version.to_s %>" How did you know about .to_s?

From https://github.com/UCSF-CBI/c4-ondemand-interactive-apps/blob/e49f5af3e5074ed1b050f1a08d1f946ee9b0eb31/RStudio/submit.yml.erb#L14. But seeing that, I have a vague recollection of having to use that in some other Ruby code way back when.

HenrikBengtsson commented 2 years ago

From https://osc.github.io/ood-documentation/latest/app-development/interactive/template.html#context:

The context object is an object that holds all the attributes defined by the User Form. … For example, if you want to perform arithmetic on the bc_num_hours attribute that a user filled out in the Interactive App form:

export SECONDS=<%= context.bc_num_hours.to_i * 3600 %>

So, retry with <%= context.r_version.to_s %>.
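Once the ERB step has substituted the form value, the rendered script only sees a plain string. A defensive check before `module load` might look like the sketch below. The function name `valid_r_version` and the assumption that versions always have the form major.minor.patch are assumptions of this sketch, not something taken from the repository.

```shell
#!/usr/bin/env bash

# valid_r_version VALUE
# Succeeds if VALUE is "default" or looks like a version such as 4.0.2.
valid_r_version() {
    [[ $1 == "default" || $1 =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]
}

# In the rendered script, the right-hand side would come from
# <%= context.r_version.to_s %>; here it is hard-coded for illustration.
ONDEMAND_R_VERSION="4.0.2"
if ! valid_r_version "${ONDEMAND_R_VERSION}"; then
    echo "Unexpected R version: '${ONDEMAND_R_VERSION}'" >&2
fi
```

With a check like this, an empty or malformed value shows up as a clear message in output.log instead of silently falling back to the default R.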

hgputnam commented 2 years ago

So, retry with <%= context.r_version.to_s %>.

Exactly the answer Bob Ostrohm suggested on discourse...

But seeing that, I have a vague recollection of having to use that in some other Ruby code way back when.

Yeah, but slurm_scratch is a string object. I noticed that numeric objects have a similar method, but called .to_i. I just assumed that since the select items were strings, we should do the same? I would love to find documentation for those form.yml object types somewhere...

hgputnam commented 2 years ago

Yay it works!


> R.version
               _                           
platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          4                           
minor          0.2                         
year           2020                        
month          06                          
day            22                          
svn rev        78730                       
language       R                           
version.string R version 4.0.2 (2020-06-22)
nickname       Taking Off Again            
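The same check can be done from a terminal without opening RStudio. This sketch assumes the first line of `R --version` has the usual form `R version 4.0.2 (2020-06-22) -- "Taking Off Again"`; the helper name `r_version_from_banner` is made up.

```shell
#!/usr/bin/env bash

# r_version_from_banner: read 'R --version' output on stdin and print
# just the version number found on its first line.
r_version_from_banner() {
    sed -n '1s/^R version \([0-9][0-9.]*\).*/\1/p'
}
```

For example, `R --version | r_version_from_banner` after the module loads should print the version selected in the form.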

Shall I push the change to the other server and let the beta group know?

hgputnam commented 2 years ago

Actually, let me update documentation first...

HenrikBengtsson commented 2 years ago

Awesome! Now that we have a working prototype, I have a code cleanup I'd like to commit that will make things neater going forward. Let me know when you've committed and pushed what you've got so far.

hgputnam commented 2 years ago

Ok, go ahead. I pulled to c4-ondemand1 but did not deploy. Almost done with docs...

HenrikBengtsson commented 2 years ago

Done, cf. commit 38ddc921

HenrikBengtsson commented 2 years ago

Forgot to say, see https://github.com/OSC/ood-documentation/issues/674#issuecomment-1117338062 for why this way of setting options works.

hgputnam commented 2 years ago

I see. That Jeff (not Bob) guy is super helpful; he has helped me a few times now and I think he kind of runs things for the OOD project.

hgputnam commented 2 years ago

I will notify the beta team tomorrow. I need to spend some quality time with our new switches...

HenrikBengtsson commented 2 years ago

Did you deploy? I don't see it so can't test it myself.

FYI, I wanna announce R 4.2.0, but this one is a blocker, since I want to upgrade module load r to default to R 4.2.0.

hgputnam commented 2 years ago

I was going to wait until announcing it. You can test from c4-ondemand2 which users do not yet know about. All changes are deployed there. https://c4-ondemand2.ucsf.edu

HenrikBengtsson commented 2 years ago

Confirmed with a few different versions that it works.

I say, deploy first. At this point, nothing changes from the OnDemand user's point of view. Some will notice before your announcement.

Now, we need to update the default r module to be r/4.2.0, which means users have to be informed, because some will be surprised by having to re-install R packages etc. I have a ready-to-go Slack announcement for the new R version that covers instructions on package updates, the Bioconductor release coupled with R 4.2.0, etc. This needs to go out as soon as R 4.2.0 is the new default. After that announcement has gone out, you could inform OnDemand users (and/or I could add it to that Slack message).

HenrikBengtsson commented 2 years ago

I also got a ready-to-go update to https://c4.ucsf.edu/howto/r.html for the new R 4.2.0.

hgputnam commented 2 years ago

OK, fully deployed on both ondemand servers.

HenrikBengtsson commented 2 years ago

Thanks, done: