rocker-org / rocker-versioned2

Run current & prior versions of R using docker. rocker/r-ver, rocker/rstudio, rocker/shiny, rocker/tidyverse, and so on.
https://rocker-project.org
GNU General Public License v2.0

Error running rocker/binder:4.x images #88

Closed nuest closed 2 years ago

nuest commented 4 years ago

I have trouble running the rocker/binder images for R 4.x.

Here is a minimal example repo, based on the README from https://hub.docker.com/r/rocker/binder

https://github.com/nuest/rocker-binder-test


Problem: When I go to "New > RStudio", I do not get RStudio but am redirected to a weird-looking URL with the error message "requested page was not found":

https://hub-binder.mybinder.ovh/user/nuest-rocker-binder-test-g3uxb244/rstudio/https,http:/hub-binder.mybinder.ovh/auth-sign-in


I observe similar behaviour when running rocker/binder:4.0.1 and rocker/binder:4.0.2 locally with repo2docker: I'm redirected from http://127.0.0.1:37847/tree to http://127.0.0.1:37847/rstudio to http://127.0.0.1:49189/auth-sign-in and get a "404: Not Found" error.


These are the logs:

    To access the notebook, open this file in a browser:
        file:///home/rstudio/.local/share/jupyter/runtime/nbserver-1-open.html
    Or copy and paste this URL:
        http://127.0.0.1:49189/?token=9f17f63b2e9c5cbfc985846287db03928071ff9ad3995a9c
[I 11:52:17.279 NotebookApp] 302 GET /?token=9f17f63b2e9c5cbfc985846287db03928071ff9ad3995a9c (172.17.0.1) 0.91ms
[I 11:52:22.464 NotebookApp] 302 GET /rstudio/ (172.17.0.1) 908.96ms
[W 11:52:22.508 NotebookApp] 404 GET /auth-sign-in (172.17.0.1) 19.66ms referer=http://127.0.0.1:49189/tree

Does anyone have an idea what's going wrong here?
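
The redirect chain in the logs above can be picked out mechanically. What follows is a minimal Python sketch, not part of the original report: it assumes the Jupyter Notebook access-log layout seen above (`[LEVEL TIME App] STATUS METHOD PATH (CLIENT) DURATIONms`), and the names `LOG_LINE`, `parse_access_log`, and `failed_requests` are made up for illustration.

```python
import re

# Assumed layout, copied from the log lines in this report:
#   [I 11:52:22.464 NotebookApp] 302 GET /rstudio/ (172.17.0.1) 908.96ms
LOG_LINE = re.compile(
    r"^\[(?P<level>[A-Z]) (?P<time>[\d:.]+) (?P<app>\w+)\] "
    r"(?P<status>\d{3}) (?P<method>[A-Z]+) (?P<path>\S+) "
    r"\((?P<client>[^)]+)\) (?P<duration_ms>[\d.]+)ms"
)


def parse_access_log(text):
    """Return one dict per line that matches the access-log layout."""
    entries = []
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            entry = match.groupdict()
            entry["status"] = int(entry["status"])
            entries.append(entry)
    return entries


def failed_requests(entries):
    """Keep only the requests answered with a 4xx or 5xx status."""
    return [entry for entry in entries if entry["status"] >= 400]


# The three request lines from the report, verbatim.
SAMPLE = """\
[I 11:52:17.279 NotebookApp] 302 GET /?token=9f17f63b2e9c5cbfc985846287db03928071ff9ad3995a9c (172.17.0.1) 0.91ms
[I 11:52:22.464 NotebookApp] 302 GET /rstudio/ (172.17.0.1) 908.96ms
[W 11:52:22.508 NotebookApp] 404 GET /auth-sign-in (172.17.0.1) 19.66ms referer=http://127.0.0.1:49189/tree
"""

for entry in failed_requests(parse_access_log(SAMPLE)):
    print(entry["status"], entry["path"])  # prints: 404 /auth-sign-in
```

Read this way, the request for `/rstudio/` is answered with a redirect, and the follow-up request lands on `/auth-sign-in` at the server root, outside the `/rstudio/` prefix, where the notebook server has nothing to serve.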

cboettig commented 4 years ago

I think this may be related to #78 but not sure... we'll see if the new RStudio is released before I get to investigate properly!

nuest commented 4 years ago

It could be. I do recall seeing the spinner as well, but only once, so I didn't properly report it.

:+1: makes sense to see if the release fixes things. Is there an ETA for the release? I did not find it (might be too stupid though).

Will all Binder images receive the latest RStudio release?

I tried to pull the latest daily in my own Dockerfile for now, but it's not as simple as I hoped. I get this error:

(Reading database ... 59423 files and directories currently installed.)
Preparing to unpack rstudio-server-1.4.869-amd64.deb ...
Unpacking rstudio-server (1.4.869) over (1.3.1093) ...
Setting up rstudio-server (1.4.869) ...
useradd: user 'rstudio-server' already exists
useradd: group rstudio exists - if you want to add this user to that group, use -g.
chpasswd: (user rstudio) pam_chauthtok() failed, error:
Authentication token manipulation error
chpasswd: (line 1, user rstudio) password not changed
chown: invalid user: ‘rstudio:rstudio’
addgroup: The user `rstudio' does not exist.
chown: invalid user: ‘rstudio:rstudio’
chown: invalid user: ‘rstudio:rstudio’

I'm happy to wait a bit though for the problem to solve itself.

cboettig commented 4 years ago

Thanks @nuest, it looks like the image is a bit out of date (that could be an error on my end, since I was testing against my local build). Yup, everything should percolate through once RStudio is released. I do mean to sink some time into the binder setup (we're trying to standardize things such as the Python setup across binder, the ml images, etc., so that things are consistent), but unfortunately a lot of moving parts are still in the air.

nuest commented 4 years ago

Thank you for the update. I cannot promise much help; this new stack does look quite impressive.

I'm surely up for testing though, happy to build and run things locally. Ping me here or directly, if that's helpful.

The core Binder team seems to have given up on R a little bit, or is just focussing on Conda. So I want rocker/binder to rock :-).

cboettig commented 4 years ago

Yeah, feel free to poke around. One of the core tricks in the binder image comes from the pip package jupyter-rsession-proxy, which runs rsession without rserver, i.e. runs RStudio without root (something that would be very handy for applications in podman, kubernetes and singularity, all of which stumble over the root issue). So I'm a little nervous that the RStudio release changed things that broke that trick, but I'm not sure. I tried mimicking the commands jupyter-rsession-proxy was using to run rsession directly here: https://github.com/rocker-org/rocker-versioned2/blob/master/scripts/rsession.sh, but so far that doesn't seem to work.

Our campus-hosted system uses this recipe: https://github.com/berkeley-dsep-infra/datahub/blob/staging/deployments/r/image/Dockerfile, which also uses jupyter-rsession-proxy. I believe it is impacted by the current RStudio disable-auth bug, and it has been confirmed to work with the new RStudio release, which is why I'm somewhat hopeful that the problem for binder is the same and that the new release will restore binder to working order.

I believe @ryanlovett is responsible for much of the magic in jupyter-rsession-proxy and non-root execution, but I never did figure out the details.

ryanlovett commented 4 years ago

Hi @cboettig, jupyter-rsession-proxy has gone back and forth between running rsession and rserver, but it now runs rserver. There is still code to run rsession, but there's no entry point for it. And there's no magic to run as non-root. :) There's a pending PR to help it work better as non-root on multi-user systems, but it shouldn't have any trouble in containers.

cboettig commented 4 years ago

Thanks @ryanlovett, this is super helpful. Using the RStudio daily, you're totally right that I can run rserver as a normal user and still log in. For some reason, trying the same thing with the current stable RStudio release throws an error about secure cookies at me instead. Anyway, this is definitely handy, and I'm looking forward to exploring more.

cc @noamross maybe we should explore some non-root defaults for the rstudio container? Perhaps we could drop the s6-init system by default and leave it as an opt-in configuration for setups where you need it... thoughts?

cboettig commented 4 years ago

@ryanlovett the permissions issues here really confuse me. It seems strange (wrong?) that I can run rserver as the user "rstudio" and then log into the RStudio server in the browser as a different user (if the container has multiple users configured) ... but it works! (Unless I'm really confused.) I don't mean to bug you about things I should really ask the RStudio team, but is that what you observe too? Does that not seem weird?

rokroskar commented 4 years ago

We are also experiencing this problem in images we are building for the Renku platform. In May, builds using the same rocker base image (rocker/verse:4.0.0-ubuntu18.04) seemed to work without problems, but now we get this same strange broken redirect. It looks like you overwrite the tags in the main Docker Hub repository, but is there some place where tagged versions are preserved so we could build our images against the same 4.0 image we used a few months back?

FWIW, I really don't know what to make of this behavior, but after trying several times the redirect eventually self-corrects and I get to a running RStudio session as expected.

cboettig commented 4 years ago

@rokroskar apologies, but can you be more specific? The issue title is about the binder images, and you're not running those images. What problem are you experiencing?

The ubuntu18.04 images aren't being actively developed at this time; we would really encourage users to use the more recent images based on Ubuntu 20.04 instead (e.g. rocker/verse).

Note that at least on the new images, you should be able to adjust the RStudio version relatively easily by setting the env var RSTUDIO_VERSION and re-running the /rocker_scripts/install_rstudio.sh script.

rokroskar commented 4 years ago

Hi @cboettig, sure, sorry for being vague. The problem we are seeing is this redirect to a broken URL when trying to reach the /rstudio path in a Jupyter server running jupyter-rsession-proxy. Our images are very similar to the binder setup.

I was able to fix this in the newer images with the rstudio install script as you suggest, but the question remains - is there another docker repository where you push versioned images that can be used to pin our own builds against? In this case the solution was relatively simple, but if something breaks deeper in one of the dependencies it might be really hard to fix and it means that we cannot reliably make new builds of our own that depend on the rocker images.

cboettig commented 4 years ago

Thanks @rokroskar, I follow now. Yes, we intend to pin versions of everything, including the RStudio version, in the older containers so that they don't update and break on you. We missed that on the image you mentioned; see https://github.com/rocker-org/rocker-versioned2/blob/master/stacks/core-4.0.0-ubuntu18.04.json#L28, which should be adjusted.

R 4.0.0 was released on April 24, and so should be pinned to RStudio 1.2.5042-1. Fortunately, I think that's the last version before RStudio broke the disable_auth option (see #78). It's fixed in the dailies, but the 1.4.x RStudio release isn't out yet, so you could set the RStudio version to daily instead and that should also resolve the problem.

Apologies we missed the version pinning on the RStudio version. We're a community project here and rely on community reports to catch issues, so thanks again for reporting it! I'll rebuild pinned images now.

(Contextual note: this issue is simply related to the RStudio bug, #78, and not related to the issue of running rserver under non-root settings).

cboettig commented 4 years ago

Ick, apologies, my math was wrong. 4.0.0 was frozen on the release of 4.0.1, and so should be pinned to 1.3.959-1.

rokroskar commented 4 years ago

Aha, OK, that makes sense now. Thanks for the explanation! We'll shift to the newer (Ubuntu 20.04-based) images soon anyway, so hopefully we can stay current that way.

Robinlovelace commented 3 years ago

In case it's of use + interest, I just updated the rocker/binder version used in the robinlovelace/geocompr repo to the latest, as per above, but binder still fails: https://mybinder.org/v2/gh/robinlovelace/geocompr/master?urlpath=rstudio

Sad because binder used to work really, really well for teaching. Like @nuest I'm happy to help if I can.

Robinlovelace commented 3 years ago

FWIW the url in the image looks suspicious: https://hub-binder.mybinder.ovh/user/robinlovelace-geocompr-x16nre90/rstudio/https,http:/hub-binder.mybinder.ovh/auth-sign-in?appUri=%2F%3Ftoken%3D6RbyQh4FQn6RyYAajPVo4w
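
The suspicious part of that URL is easy to isolate: everything after the `/rstudio/` mount point should be a path inside RStudio, but here it starts with what looks like a scheme list (`https,http:`). The Python sketch below performs that check. The helper names are hypothetical, and the `/rstudio/` mount point is assumed from the URLs quoted in this thread.

```python
from urllib.parse import urlsplit


def path_after_mount(url, mount="/rstudio/"):
    """Return the part of the URL path that follows the proxy mount point."""
    path = urlsplit(url).path
    _, found, rest = path.partition(mount)
    return rest if found else ""


def has_embedded_scheme(url, mount="/rstudio/"):
    """True if the first path segment after the mount ends in ':' (e.g. 'https,http:')."""
    first_segment = path_after_mount(url, mount).split("/", 1)[0]
    return first_segment.endswith(":")


# URL copied from the comment above.
broken = (
    "https://hub-binder.mybinder.ovh/user/robinlovelace-geocompr-x16nre90/"
    "rstudio/https,http:/hub-binder.mybinder.ovh/auth-sign-in"
    "?appUri=%2F%3Ftoken%3D6RbyQh4FQn6RyYAajPVo4w"
)

print(path_after_mount(broken))     # https,http:/hub-binder.mybinder.ovh/auth-sign-in
print(has_embedded_scheme(broken))  # True
```

This only describes the symptom (an absolute URL pasted into the path); it does not say which component produced it.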


Robinlovelace commented 3 years ago

Not my area but looks like a redirection issue. The image is there and I can click on the link to launch RStudio but when I do it goes to that weird URL shown above:

[screenshot not preserved]

rokroskar commented 3 years ago

Hi @Robinlovelace, pinning the RStudio version to 1.2.5042 fixed the problem for us. The relevant change is here. Hope that helps!

vnijs commented 3 years ago

I can confirm that 1.3.959 is the last version of RStudio that works with jupyter-rsession-proxy. I just tried the binder Dockerfile with the preview version of RStudio and the current daily (1.4.996), and both give the error below.

[screenshot not preserved]

FROM rocker/geospatial:4.0.3

LABEL org.label-schema.license="GPL-2.0" \
    org.label-schema.vcs-url="https://github.com/rocker-org/rocker-versioned" \
    org.label-schema.vendor="Rocker Project" \
    maintainer="Carl Boettiger <cboettig@ropensci.org>"

ENV RSTUDIO_VERSION=1.4.993
RUN /rocker_scripts/install_rstudio.sh

ENV NB_USER=jovyan

RUN /rocker_scripts/install_python.sh
RUN /rocker_scripts/install_binder.sh

CMD jupyter notebook --ip 0.0.0.0

USER ${NB_USER}

WORKDIR /home/${NB_USER}

Robinlovelace commented 3 years ago

Heads-up @rokroskar, your advice fixed the issue, many thanks!

See commits here for changes:

https://github.com/Robinlovelace/geocompr/issues/570

And here for the working RStudio version - will be great to get it working on the latest version of RStudio again.

https://mybinder.org/v2/gh/robinlovelace/geocompr/master?urlpath=rstudio

rokroskar commented 3 years ago

@Robinlovelace great, glad that helped! I saw somewhere a mention that it also works with some 1.3.x versions of RStudio - did you happen to try?

Robinlovelace commented 3 years ago

> @Robinlovelace great, glad that helped! I saw somewhere a mention that it also works with some 1.3.x versions of RStudio - did you happen to try?

No, I haven't tried. The fewer changes the better from my perspective, so I'm waiting for the released/dev versions to work again and am happy to help make that happen. All this sharing will help minimise the fall-out from this issue.

vnijs commented 3 years ago

@rokroskar 1.3.959 is the last version that I have found to work

cboettig commented 3 years ago

Thanks @vnijs, I'm going to pin the RStudio version to 1.3.959 in the install_binder.sh script, which will downgrade binder:4.0.0 through binder:latest to that RStudio version.

It's pretty annoying that even the RStudio dailies aren't working here. I'm not quite sure why that is, but I can reproduce the same 500 error. Given the version issue, I suspect there is still a bug in the current RStudio implementation that needs to be tracked down. It seems unrelated to the previous bug that was also blocking --disable_auth.
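
For anyone scripting around this while the pin is in place, the version reports in this thread boil down to one comparison. The Python sketch below is hedged accordingly: `LAST_KNOWN_GOOD` reflects only what was reported here (1.3.959 works with jupyter-rsession-proxy; the newer versions tried did not), and the helper names are invented for illustration.

```python
# Last RStudio version reported in this thread to work with jupyter-rsession-proxy.
# This is a snapshot of the discussion, not an official compatibility statement.
LAST_KNOWN_GOOD = (1, 3, 959)


def parse_rstudio_version(text):
    """Turn '1.3.959' or '1.3.959-1' into a comparable tuple of ints."""
    core = text.strip().split("-", 1)[0]  # drop a package suffix such as '-1'
    return tuple(int(part) for part in core.split("."))


def reported_to_work(text):
    """True if the version is no newer than the last one reported as working."""
    return parse_rstudio_version(text) <= LAST_KNOWN_GOOD


for version in ["1.2.5042-1", "1.3.959-1", "1.3.1093", "1.4.996"]:
    print(version, reported_to_work(version))
```

Non-numeric values such as `daily` are deliberately not handled and raise `ValueError`, since a floating tag cannot be compared against a pinned version.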

vnijs commented 3 years ago

@cboettig I assumed that the PR that fixed the --disable_auth issue also introduced the issue with jupyter-rsession-proxy

https://github.com/rstudio/rstudio/pull/7628

Robinlovelace commented 3 years ago

Heads-up @cboettig, this sounds like it may be a future-proof solution for the daily version:

On 1.4, it's possible to pass --www-root-path=/rstudio to rserver so it takes care of redirecting to the right URL path for rocker/binder.

From the discussion linked to above.

cboettig commented 3 years ago

Thanks @Robinlovelace, I hadn't seen that. I think it will require a change to jupyter-rsession-proxy; hopefully they can add the ability to let us set that with an environment variable.
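
One shape such a change could take is sketched below in Python. This is not jupyter-rsession-proxy's actual code: the function `build_rserver_command` and the environment variable `EXAMPLE_WWW_ROOT_PATH` are hypothetical. Only `--www-root-path` comes from the quote above; `--www-port` is assumed here as rserver's port option.

```python
import os


def build_rserver_command(port, environ=None):
    """Assemble an rserver command line, adding --www-root-path only on request.

    `environ` defaults to the process environment; passing a dict makes the
    function easy to exercise without touching real environment variables.
    """
    env = os.environ if environ is None else environ
    command = ["rserver", f"--www-port={port}"]

    # Hypothetical opt-in switch: leave the flag out unless the variable is set,
    # so RStudio versions that predate the option are not handed an unknown flag.
    root_path = env.get("EXAMPLE_WWW_ROOT_PATH")
    if root_path:
        command.append(f"--www-root-path={root_path}")
    return command


print(build_rserver_command(8787, {}))
print(build_rserver_command(8787, {"EXAMPLE_WWW_ROOT_PATH": "/rstudio"}))
```

Keeping the flag opt-in matters because, per the quote, the option is available on 1.4, while the images discussed here are pinned to 1.3.959.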

Robinlovelace commented 3 years ago

Happy new year everyone! Just checking up on this issue, potentially an important one for reproducibility, so keen to help out. Any way I can help with this issue @cboettig ?

Robinlovelace commented 3 years ago

Heads-up @cboettig, following various threads I've come across this: https://github.com/kubeflow/kubeflow/pull/5570/files

Apparently it works. Hope that helps!

cboettig commented 3 years ago

@Robinlovelace thanks, yes! @DavidSpek also mentioned this over at https://github.com/rocker-org/rocker-versioned2/issues/90#issuecomment-770430144. I'm hoping we'll see this as an upstream fix, e.g. https://github.com/jupyterhub/jupyter-rsession-proxy/issues/95

eitsupi commented 2 years ago

This issue seems to be resolved by #309. rocker/binder:latest (rocker/binder:4.1.2) seems to work correctly now. https://github.com/rocker-org/rocker-versioned2/wiki/binder_6fac4d6e57c3

eitsupi commented 2 years ago

I think this issue has been resolved, so I'm closing it.