Closed ablekh closed 4 years ago
You might be interested in nbrsessionproxy, a package that adds RStudio sessions to Jupyter (and thereby JupyterHub).
@minrk Thank you for your prompt feedback and for reminding me about nbrsessionproxy. I meant to use it, but somehow completely forgot about it (I guess I became too excited about the possibility of using a pre-built Rocker image without installing additional dependencies :-). I assume that, upon selecting the RStudio Session item in the menu, the user will see the standard RStudio UI, similar to how Binder presents it.
If I understand correctly, in order to use nbrsessionproxy, I would essentially have to build a custom Docker image on top of one of the standard (or our custom) Jupyter images. Correct?
I'm still quite curious about my original approach and the above-mentioned error. Do you have thoughts about what was going on and advice on how to fix it? Would my idea of using jupyter-repo2docker work?
@ablekh I know it is a bit complicated but possible to get rstudio working alongside jupyterlab and various kernels.
I'm interested in finding the minimal steps needed to do this. I would not recommend using jupyter-repo2docker in order to get a z2jh image, but instead attempt to base a new image on one of the jupyter/docker-stacks repo's images. Speaking of which, perhaps one of them already has RStudio ready for use?
UPDATE: the datascience-notebook, built on top of r-notebook, did not have RStudio installed.
QUESTION: what is needed on top of a datascience-notebook, in order to be able to access the /rstudio endpoint alongside the /lab endpoint?
@consideRatio Thank you for your continued support. :-)
The goal for this effort is to have a full-featured (as Binder does) multi-user RStudio environment, using all the benefits of a K8s-based containerized JupyterHub setup. Unfortunately, the jupyter/docker-stacks project does not offer RStudio-enabled images, hence my attempt to use the Rocker ones. I have some experience with creating custom JupyterHub images based on docker-stacks, but I have always used jupyter-repo2docker to build those images. I'm curious about why you're recommending against it. What are the problems with this tool? If not using it, how would you go about building a custom image based on docker-stacks?
@ablekh oh thank you for investigating and sharing all kinds of knowledge and experience!
It is my understanding that repo2docker is meant to build for a specific repo, and that a specific repo is meant to run under one rather than multiple different kernels, which a typical z2jh image may want to have.
Does repo2docker utilize the docker-stacks images as a base image? Is that how you are utilizing them (indirectly) while building using repo2docker as a tool?
Beware that I'm only using my shallow knowledge to recommend against using repo2docker specifically for creating images for use with a z2jh deployment if you want to make a non-repo specific image for general use.
A friend of mine has built a huge docker image with RStudio enabled on top of a docker-stacks base image, so it is possible. I have not yet learned how to do it, but I recall that nbrsessionproxy was essential as @minrk suggested. Will get back to you if I set this up, it is something I'd like to do but doesn't have a high priority in comparison to other tasks atm.
@consideRatio My pleasure. Thank you for sharing your knowledge and experience as well. I will very much appreciate any further help/advice from you and others in this nice community. In the meantime, I will try to use nbrsessionproxy (after getting some sleep - was up all night mostly working) and will share findings.
Re: my experience using jupyter-repo2docker - I have used this tool to successfully build a custom image, adding an Octave kernel on top of the datascience-notebook image source from the jupyter-docker-stacks project, in order to simultaneously support Python, R, Julia and Octave.
QUESTION: what is needed on top of a datascience-notebook, in order to be able to access the /rstudio endpoint alongside the /lab endpoint?
This dockerfile will install rstudio and install & enable nbrsessionproxy:
# base image: jupyter/r-notebook or jupyter/datascience-notebook
FROM jupyter/r-notebook

# install nbrsessionproxy extension
RUN conda install -yq -c conda-forge nbrsessionproxy && \
    conda clean -tipsy

# install rstudio-server
USER root
RUN apt-get update && \
    curl --silent -L --fail https://download2.rstudio.org/rstudio-server-1.1.419-amd64.deb > /tmp/rstudio.deb && \
    echo '24cd11f0405d8372b4168fc9956e0386  /tmp/rstudio.deb' | md5sum -c - && \
    apt-get install -y /tmp/rstudio.deb && \
    rm /tmp/rstudio.deb && \
    apt-get clean
ENV PATH=$PATH:/usr/lib/rstudio-server/bin

USER $NB_USER
If you already have an image with rstudio server, then just installing nbrsessionproxy should be enough, following the installation instructions, either with pip or conda.
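The Dockerfile above guards the RStudio download with an md5sum check. As a rough illustration of what that step verifies, here is a minimal Python sketch; the function name md5_matches is hypothetical and is not part of any of the tools discussed in this thread.

```python
import hashlib


def md5_matches(path, expected_hex, chunk_size=1 << 20):
    """Return True if the MD5 digest of the file at path equals expected_hex."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so a large .deb does not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

Calling md5_matches("/tmp/rstudio.deb", "24cd11f0405d8372b4168fc9956e0386") would mirror the check in the Dockerfile, assuming the file has already been downloaded to that path.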
Wow thanks @minrk !!
@minrk Thank you very much for your advice. Will definitely try this approach. I still hope to hear your / others' opinion on my original (Rocker) approach, which IMO should work (perhaps, with some changes).
BTW, how do you recommend building the image (based on the Dockerfile you shared above): using jupyter-repo2docker or some other method (I guess, by simply using the docker command)? As you can see from @consideRatio's and my recent comments here, we have quite different experiences in this regard ...
UPDATE: Hey, folks! Just wanted to let everyone know that, based on @minrk's advice (thanks again!; the nbrsessionproxy approach), I was able to successfully build the relevant image (using jupyter-repo2docker) as well as configure and run RStudio on our separate AKS-based JupyterHub cluster (actually, it was yesterday afternoon - sorry about delaying the update). There were some arguably AKS-specific issues (which drove me slightly crazy :-), but they were either fixed by me or went away over time.
For the sake of completeness, I want to say that, while this nbrsessionproxy-based approach is nice, it is mainly suited to multi-kernel JupyterHub implementations (which, if I can guess, most likely represent more than 95% of all JH deployments). For the rest of the deployments, which have special requirements, such as, in this case, my preference for an RStudio-only deployment (a la the relevant Binder example deployment), I suspect that my original approach (or, likely, a modification of it) would be much more appropriate. For this particular case, perhaps there is a compromise solution (e.g., via JH configuration) that would allow an immediate in-place redirect (in the same tab) after authentication to the user's RStudio session endpoint, without opening the standard JH UI and having to select the RStudio Session menu item to open the RStudio UI in a new tab.
As I've said, I'm still curious about my original (Rocker image-based) approach [see my initial comment in this thread], so if someone would like to share their thoughts on this, I would be delighted to hear them.
@ablekh I believe you can configure JupyterHub with c.Spawner.args = ['--NotebookApp.default_url=/rstudio'] to get the behavior you want.
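To make the shape of that setting concrete, here is a small sketch that builds the argument list for an arbitrary landing path. The helper name default_url_args is hypothetical; only the --NotebookApp.default_url flag and c.Spawner.args come from the comment above.

```python
def default_url_args(default_url="/rstudio"):
    """Build single-user server arguments that open default_url after login."""
    # The path is served relative to the user's server, so it should
    # start with a slash.
    if not default_url.startswith("/"):
        default_url = "/" + default_url
    return ["--NotebookApp.default_url=" + default_url]


# In a JupyterHub config file, where `c` is provided by JupyterHub,
# this would be used as:
#     c.Spawner.args = default_url_args("/rstudio")
print(default_url_args())
```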
@ryanlovett Thank you so much for this advice. I was pretty sure that it was possible, but wasn't sure which configuration option is responsible for that behavior (opening a custom endpoint instead of the default one). Using this opportunity, I'd like to thank you for creating nbrsessionproxy as well as for your help in general.
P.S. BTW, do you know by any chance why the username in RStudio sessions remains the default jovyan instead of the expected one, based on GitHub authentication? When building the RStudio-enabled custom image, in the end I noticed a warning about NBUSER or such not being used (the wording was different, but not the essence). I suspect that this is what causes this minor issue. Also, what do you think about my original Rocker-based approach?
Which of the following methods for implementing @ryanlovett's advice (see above) is correct (or better):
# method 1
singleuser:
  defaultUrl: "/rstudio"

# method 2
hub:
  extraConfig: |-
    c.KubeSpawner.singleuser_image_pull_secrets = "<SECRET_NAME>"
    c.Spawner.args = ['--NotebookApp.default_url=/rstudio']
Not looked into the details, but I would go with the defaultUrl value.
@ablekh PS: the singleuser_ prefix is deprecated; writing c.KubeSpawner.image_pull_secrets = "<SECRET_NAME>" is the current recommended practice.
@consideRatio Thanks much for your advice on both aspects. If you still need any help with creating custom RStudio-focused Docker images, please let me know and I will do my best to help ...
Just updated my RStudio-focused deployment's config.yaml and upgraded the cluster. The defaultUrl option worked, directly opening the RStudio UI upon authentication; however, the following error message gets produced in RStudio's console window. Any thoughts?
24 Oct 2018 08:19:26 [rsession-jovyan] ERROR session hadabend; LOGGED FROM: rstudio::core::Error {anonymous}::rInit(const rstudio::r::session::RInitInfo&) /home/ubuntu/rstudio/src/cpp/session/SessionMain.cpp:563
Note to self: in non-default UI environments like this, JupyterHub's Control Panel can still be accessed at the /hub/home endpoint (for administrative stuff, if enabled, see the Admin menu item, i.e. the /hub/admin endpoint).
BTW, do you know by any chance why username in RStudio sessions remain default jovyan instead of expected one, based on GitHub authentication?
The default docker-stacks images can switch the username to $NB_USER (and also $NB_UID and $NB_GID if you want), but you need to run the image as root, e.g. see start.sh: https://github.com/jupyter/docker-stacks/blob/f2889d7ae7d6a4a404169b985f2f2ca421f388a1/base-notebook/start.sh#L47
You could try something similar in your image?
@manics Thank you very much for this advice. However, I'm not sure which level / step of the deployment workflow you mean when talking about running an image as root. Could you clarify this for me?
Just dug out my config and it's a bit more complicated than I thought. Setting singleuser.uid: 0 to start the singleuser server as root is the easy bit, but you need to pass extra information (GitHub username) from the spawner to jupyter. I've got a test system working with LDAP (with @consideRatio's help):
hub:
  extraConfig: |
    ...
    Lots of extra stuff
    ....
    class LDAPAuthenticatorInfoUID(LDAPAuthenticatorInfo):
        @gen.coroutine
        def pre_spawn_start(self, user, spawner):
            auth_state = yield user.get_auth_state()
            self.log.error('pre_spawn_start auth_state:%s' % auth_state)
            if not auth_state:
                return
            # setup environment
            spawner.environment['NB_UID'] = str(
                auth_state['uidNumber'][0])
            spawner.environment['NB_USER'] = auth_state['uid'][0]
This required a modified version of LDAPAuthenticator to fetch the extra info: https://github.com/jupyterhub/ldapauthenticator/pull/103 but it looks like the GitHub authenticator already includes the required fields: https://github.com/jupyterhub/oauthenticator/blob/0.8.0/oauthenticator/github.py#L162
@manics I see. Hmm, interesting ... I appreciate your help. However, I'm a bit confused by your example - should it be reworked into something like a subclass of class GitHubOAuthenticator(OAuthenticator), or is there a way to simply (without lots of code) pass the already captured GitHub name value via extraConfig?
Yes, you effectively create the authenticator subclass in extraConfig instead of building a custom image. Since GitHubOAuthenticator already passes the GitHub username it should be fairly easy; something like this might work (I haven't tried it):
singleuser:
  uid: 0
hub:
  extraConfig: |
    from tornado import gen
    from oauthenticator.github import GitHubOAuthenticator

    class CustomGitHubOAuthenticator(GitHubOAuthenticator):
        @gen.coroutine
        def pre_spawn_start(self, user, spawner):
            auth_state = yield user.get_auth_state()
            self.log.info('pre_spawn_start auth_state:%s' % auth_state)
            if not auth_state:
                return
            # setup environment; github_user holds the GitHub user record,
            # whose 'login' field is the username
            spawner.environment['NB_USER'] = auth_state['github_user']['login']

    # use the subclass defined above
    c.JupyterHub.authenticator_class = CustomGitHubOAuthenticator
auth:
  state:
    enabled: True
    cryptoKey: SECRET-KEY
PS ping me in Gitter if you want
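The environment-setting step inside pre_spawn_start can be tried out in isolation, without a running hub. The following is a sketch only: nb_user_env is a hypothetical helper, and it assumes that the github_user entry of auth_state is either the username itself or a mapping with a "login" key, which may differ between oauthenticator versions.

```python
def nb_user_env(auth_state, environment=None):
    """Return a copy of environment with NB_USER set from GitHub auth state."""
    env = dict(environment or {})
    if not auth_state:
        # auth_state is empty unless it is enabled (and encrypted) on the hub.
        return env
    github_user = auth_state.get("github_user")
    # Assumption: github_user is either the login string or a mapping
    # that carries the username under "login".
    if isinstance(github_user, dict):
        github_user = github_user.get("login")
    if github_user:
        env["NB_USER"] = str(github_user)
    return env
```

Inside an authenticator subclass, the body of pre_spawn_start could then reduce to updating spawner.environment with the result, which keeps the part that depends on the shape of auth_state easy to test.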
@ablekh No problem, I'm happy the extension has been useful for others. :)
If you need the container to have NB_USER set to be the same as what your authenticator provides, @manics solution looks like the right approach to me. We do the same when we need to slightly alter hub behavior.
Also, what do you think about my original Rocker-based approach?
I'm not too familiar with the Rocker images but you can either start with an R/RStudio based image and add Jupyter+JupyterHub support or the other way around. In general I think images should be tailored to your use case and have less extraneous components (unless the point of your user environment is to expose people to a broad set of packages). We create our own images for data science courses at Cal so that they have just the right mix of packages.
@ryanlovett Thank you for additional clarifications. I will further try Rocker approach when I get a chance.
As for the solution for the username suggested by @manics, I implemented it earlier today, but I am still getting a 500 error. His initial advice was producing some errors, which I was able to figure out (a missing import statement and the --allow-root parameter for the spawner). However, after all those issues were fixed, the most recent error message looks like the following (note the rsession-related lines). Any thoughts?
[W 2018-10-24 12:05:01.773 SingleUserNotebookApp configurable:168] Config option `open_browser` not recognized by `SingleUserNotebookApp`. Did you mean `browser`?
[I 2018-10-24 12:05:02.093 SingleUserNotebookApp extension:59] JupyterLab extension loaded from /opt/conda/lib/python3.6/site-packages/jupyterlab
[I 2018-10-24 12:05:02.093 SingleUserNotebookApp extension:60] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 2018-10-24 12:05:02.103 SingleUserNotebookApp singleuser:406] Starting jupyterhub-singleuser server version 0.9.4
[I 2018-10-24 12:05:02.112 SingleUserNotebookApp notebookapp:1712] Serving notebooks from local directory: /home/jovyan
[I 2018-10-24 12:05:02.112 SingleUserNotebookApp notebookapp:1712] The Jupyter Notebook is running at:
[I 2018-10-24 12:05:02.113 SingleUserNotebookApp notebookapp:1712] http://(jupyter-ablekh or 127.0.0.1):8888/user/ablekh/
[I 2018-10-24 12:05:02.113 SingleUserNotebookApp notebookapp:1713] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 2018-10-24 12:05:04.629 SingleUserNotebookApp log:158] 302 GET /user/ablekh/ -> /user/ablekh/rstudio? (@10.244.0.43) 0.97ms
[I 2018-10-24 12:05:04.864 SingleUserNotebookApp log:158] 302 GET /user/ablekh/?redirects=1 -> /user/ablekh/rstudio?redirects=1 (@10.244.0.1) 0.65ms
[I 2018-10-24 12:05:04.945 SingleUserNotebookApp log:158] 302 GET /user/ablekh/rstudio?redirects=1 -> /hub/api/oauth2/authorize?client_id=jupyterhub-user-ablekh&redirect_uri=%2Fuser%2Fablekh%2Foauth_callback&response_type=code&state=[secret] (@10.244.0.1) 2.49ms
[I 2018-10-24 12:05:05.519 SingleUserNotebookApp auth:875] Logged-in user {'kind': 'user', 'name': 'ablekh', 'admin': True, 'groups': [], 'server': '/user/ablekh/', 'pending': None, 'created': '2018-10-23T17:44:50.775270Z', 'last_activity': '2018-10-24T12:05:05.467083Z', 'servers': None}
[I 2018-10-24 12:05:05.522 SingleUserNotebookApp log:158] 302 GET /user/ablekh/oauth_callback?code=[secret]&state=[secret] -> /user/ablekh/rstudio?redirects=1 (@10.244.0.1) 325.18ms
[I 2018-10-24 12:05:05.584 SingleUserNotebookApp log:158] 302 GET /user/ablekh/rstudio?redirects=1 -> /user/ablekh/rstudio/?redirects=1 (ablekh@10.244.0.1) 1.00ms
[I 2018-10-24 12:05:05.644 SingleUserNotebookApp handlers:439] No existing rsession found
[I 2018-10-24 12:05:05.645 SingleUserNotebookApp handlers:391] Starting process...
[I 2018-10-24 12:05:05.657 SingleUserNotebookApp handlers:385] rsession died with code 0
[I 2018-10-24 12:05:06.656 SingleUserNotebookApp handlers:330] Process exited: rsession
[I 2018-10-24 12:05:08.059 SingleUserNotebookApp handlers:330] Process exited: rsession
[I 2018-10-24 12:05:10.019 SingleUserNotebookApp handlers:330] Process exited: rsession
[I 2018-10-24 12:05:12.767 SingleUserNotebookApp handlers:330] Process exited: rsession
[I 2018-10-24 12:05:16.613 SingleUserNotebookApp handlers:330] Process exited: rsession
[I 2018-10-24 12:05:21.998 SingleUserNotebookApp handlers:330] Process exited: rsession
[I 2018-10-24 12:05:29.532 SingleUserNotebookApp handlers:330] Process exited: rsession
[E 2018-10-24 12:05:40.085 SingleUserNotebookApp web:1670] Uncaught exception GET /user/ablekh/rstudio/?redirects=1 (10.244.0.1)
HTTPServerRequest(protocol='https', host='<FQDN>', method='GET', uri='/user/ablekh/rstudio/?redirects=1', version='HTTP/1.1', remote_ip='10.244.0.1')
Traceback (most recent call last):
File "/opt/conda/lib/python3.6/site-packages/tornado/web.py", line 1592, in _execute
result = yield result
File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/opt/conda/lib/python3.6/site-packages/nbserverproxy/handlers.py", line 96, in get
return await self.http_get(*args, **kwargs)
File "/opt/conda/lib/python3.6/site-packages/nbserverproxy/handlers.py", line 443, in http_get
return await self.proxy(self.port, path)
File "/opt/conda/lib/python3.6/site-packages/nbserverproxy/handlers.py", line 420, in proxy
await self.conditional_start()
File "/opt/conda/lib/python3.6/site-packages/nbserverproxy/handlers.py", line 440, in conditional_start
await self.start_process()
File "/opt/conda/lib/python3.6/site-packages/nbserverproxy/handlers.py", line 407, in start_process
proc.terminate()
AttributeError: 'Subprocess' object has no attribute 'terminate'
...
rsession died with code 0
It is hard to tell what happened other than rsession died. You can try to debug this by dropping to a Jupyter terminal and running rserver --www-port={some_num}, where some_num is a random TCP port, e.g. 50000. It could fail for any number of reasons depending on how the image was created. The simplest might be if rsession is not in your PATH, in which case the terminal would complain "command not found".
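The PATH part of that check can also be run from a notebook cell or a Python prompt inside the user container. This is a minimal sketch using only the standard library; missing_commands is a hypothetical helper name, and the default command names are the two RStudio binaries mentioned in this thread.

```python
import shutil


def missing_commands(names=("rserver", "rsession")):
    """Return the commands from names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("rserver and rsession were found on PATH")
```

If either binary is reported as missing, the ENV PATH line from the Dockerfiles in this thread (adding /usr/lib/rstudio-server/bin) is the first thing to look at.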
@ryanlovett I appreciate your advice. Will try to figure this out after the 2-day workshop (ending today), for which I was creating this particular cluster. The more I think about it, the stranger this issue looks to me, because the only differences between the failing environment and the working one are the changes described above (based on advice by @manics): subclassing GitHubOAuthenticator + adding some missing import statements + setting KubeSpawner's allow-root parameter. All of them are singularly focused on correcting the username and IMO highly unlikely to affect the R environment. Of course, I realize that side effects do happen, but I just don't see them in this case. Or I'm completely missing something ... @minrk Any ideas?
An update on how to install RStudio: nbrsessionproxy is now called jupyter-rsession-proxy, and you may want to install that instead of nbrsessionproxy? I'm not sure. Note that if you do, you should uninstall the old nbrsessionproxy first, according to their instructions.
@ryanlovett et al Should we switch future https://github.com/rocker-org/binder images to jupyter-rsession-proxy?
Hey all - since we have a few cases where RStudio works in JupyterHub, can we focus this issue on "where / how to document this" and then close it once the documentation is in place?
If I do a site-search for "RStudio" in the z2jh guide, I don't see any actionable content about how to install RStudio, or links to other guides to install it. Can we insert that in there somewhere? And if so, where would be the best place? Maybe @ryanlovett or @consideRatio know of which resources were the most helpful?
@cboettig Sorry, I missed your earlier mention. Yes, jupyter-rsession-proxy is the way to go and all new development will take place there rather than nbrsessionproxy.
@choldgraf Though jupyter-*-proxy are most useful in a JupyterHub context, getting them into JupyterHub is mostly a matter of getting it into the single user environment. Do you think this should be documented in z2jh's "Customizing User Environment" ?
Fwiw, I think:
@ryanlovett yep, basically just what you said + some links to the rsessionproxy docs would probably work. Just enough information so that somebody who searches for RStudio would know where to go next
Do I understand correctly that enabling RStudio - JupyterHub integration via jupyter-rsession-proxy is currently possible only by manually producing a custom single-user Docker image (as in the example Dockerfile)?
I think that it would be nice to have Helm chart functionality (correct me if it's already there) that would allow admins to specify arbitrary extra commands (in this case, pip install etc.), making it possible to automatically build the relevant custom image (on a master node) and push it to a target registry for further use by a spawner.
@ablekh why not use something like repo2docker for this?
I agree that building in more environment building into JupyterHub could be helpful...though I feel like that might be a complex-enough topic that it'd warrant its own issue separate from RStudio-specifically. What do you think?
@choldgraf Certainly, the repo2docker approach is good; however, it requires some separate manual steps. On the other hand, what I'm suggesting would allow complete automation (assuming that the master node has enough resources for building the relevant Docker images once in a while).
As for discussing this in a separate issue - I agree. I just mentioned this idea here to see if I'm not missing something obvious or such functionality already exists (can be achieved via existing Helm chart features).
@ablekh Yes, enabling RStudio integration in JupyterHub does require that jupyter-rsession-proxy or nbrsessionproxy be present in the user environment, along with RStudio itself. I've a tendency to roll my own images, but I'm sure there are or will be public images you can extend or re-use without having to manually produce your own.
@choldgraf I think a separate issue for environment building / repo2docker integration into JupyterHub makes sense. I know you and @yuvipanda have been contemplating the concept for a bit.
@ryanlovett I understand and agree. In many cases, public images might not be suitable for one reason or another. My thoughts/suggestions above are focused on building and deploying custom images as well. However, the core idea is to make the process more/fully automated (in the CI/CD fashion) in the context of Zero-to-JupyterHub workflow. It would not only save our time, but reduce the amount of potential mistakes.
@choldgraf @minrk described how to install it here: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/990#issuecomment-432269851
Notes:
@ryanlovett I think it makes sense to have this in customizing user environ of z2jh, but perhaps also / or within the docker-stacks repo as a "Recipe": https://jupyter-docker-stacks.readthedocs.io/en/latest/using/recipes.html
@consideRatio yes please. this has been one of the hardest things to track down. a working dockerfile no longer cut it for some reason (updates?). Trying out the suggestions in https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/990#issuecomment-470299123 (Dockerfile) and if that doesn't work, I'll try minrk's https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/990#issuecomment-432269851
UPDATE: finally got it to work. Weird mixture of issues. This was what did it:
FROM jupyter/r-notebook

RUN python3 -m pip install jupyter-rsession-proxy
RUN cd /tmp/ && \
    git clone --depth 1 https://github.com/jupyterhub/jupyter-server-proxy && \
    cd jupyter-server-proxy/jupyterlab-server-proxy && \
    npm install && npm run build && jupyter labextension link . && \
    npm run build && jupyter lab build

# install rstudio-server
USER root
RUN apt-get update && \
    curl --silent -L --fail https://download2.rstudio.org/rstudio-server-1.1.419-amd64.deb > /tmp/rstudio.deb && \
    echo '24cd11f0405d8372b4168fc9956e0386  /tmp/rstudio.deb' | md5sum -c - && \
    apt-get install -y /tmp/rstudio.deb && \
    rm /tmp/rstudio.deb && \
    apt-get clean && rm -rf /var/lib/apt/lists/*
ENV PATH=$PATH:/usr/lib/rstudio-server/bin

USER $NB_USER
Heya, sorry for invading this issue (since I have no clue where to post here, or at jupyter-rsession-proxy)
Anyway, I have a JupyterHub deployed on an AKS cluster and I decided to add RStudio through jupyter-rsession-proxy. I created a Docker image that works just fine. But when I use this image on the AKS JupyterHub and try to access user/{myuser}/rstudio/, I keep getting a "500: internal server error" with the message "could not start rstudio on time".
I am sure I am missing something obvious but I cannot figure out what it is 🤔
Also when looking at the pod logs I only get this: so not super helpful
Any help would be massively appreciated 🙏🏼
A big thank you to all the developers! I have had a few researchers at HaasBerkeley ask for this. I tried for a few days and could not get it to work. I then copied the Dockerfile Eric published above and it worked for me with a few minor changes.
Here is the Dockerfile I used:
FROM jupyter/r-notebook

RUN python3 -m pip install jupyter-rsession-proxy
RUN cd /tmp/ && \
    git clone --depth 1 https://github.com/jupyterhub/jupyter-server-proxy && \
    cd jupyter-server-proxy/jupyterlab-server-proxy && \
    npm install && npm run build && jupyter labextension link . && \
    npm run build && jupyter lab build

USER root
RUN apt-get update && \
    apt-get -y install libssl1.0.0 libssl-dev && \
    cd /lib/x86_64-linux-gnu && ln -s libssl.so.1.0.0 libssl.so.10 && ln -s libcrypto.so.1.0.0 libcrypto.so.10 && \
    cd /tmp/ && wget https://download2.rstudio.org/server/trusty/amd64/rstudio-server-1.2.5019-amd64.deb && \
    apt-get install -y /tmp/rstudio-server-1.2.5019-amd64.deb && \
    rm /tmp/rstudio-server-1.2.5019-amd64.deb && \
    apt-get clean && rm -rf /var/lib/apt/lists/*
ENV PATH=$PATH:/usr/lib/rstudio-server/bin

USER $NB_USER
update: This is a minimal set-up Dockerfile that seems to allow for RStudio install. Uses nbrsessionproxy and jupyter-server-proxy (apparently the same pre-reqs allowed streamlit to run via jupyterlab):
https://discuss.streamlit.io/t/jupyterhub-streamlit/1238/2
From minimal-notebook, this appeared to be enough:
RUN pip install jupyter-server-proxy jupyter-rsession-proxy
(I used to have the development version, as evidenced earlier in the thread, but as of time-of-writing, it appears the pypi version works just fine out of the box).
Note: trailing-slash is important with proxy address, see thread.
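For anyone generating links to the proxied RStudio endpoint, the trailing-slash requirement can be handled with a small normalizer. This is a sketch using only the standard library; with_trailing_slash is a hypothetical helper and not part of jupyter-server-proxy.

```python
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url):
    """Return url with a slash appended to its path if one is missing."""
    parts = urlsplit(url)
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    # Rebuild the URL so that any query string stays after the slash.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(with_trailing_slash("/user/ablekh/rstudio"))
```

The same rewrite is visible in the single-user server log earlier in the thread, where /user/ablekh/rstudio?redirects=1 is redirected to /user/ablekh/rstudio/?redirects=1.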
Any reason why you are using rstudio-server-1.1.419? I am using the latest rstudio-server-1.2.5033.
Just was the working version when I tried it. I’ve since figured out that the pip install version works just fine, no need to install from source
I noticed that after installing from the RStudio Server binary and running in JupyterHub, when I try to run a one-line Python script, it demands to download unrelated Python binaries. The RStudio interface also has a pulldown where it shows out-of-date R modules and lets the user update them in the Docker image, which would be lost after their server restarts. I'm hoping to have happy users but still block outbound internet and CRAN connections. Wondering if others have had a similar experience.
When running an image built from this Dockerfile I also have a Shiny entry in the Jupyter pulldown and an icon in Lab that gives a 500 error. Not sure how to remove that yet.
@scivm How did you notice this? I want to make sure it doesn't happen to me as well, so knowing whether this was a background process or something obvious would help me tell if I may be affected too.
@ablekh: Did you by any chance ever figure out how to use a Rocker image through JupyterHub? I am in the same boat as you and looking for some help.
I'm trying to create a separate JupyterHub cluster for an upcoming workshop that requires using RStudio sessions rather than using the R kernel in JupyterHub. Since a proper multi-user RStudio setup can currently only be implemented via the commercial version of RStudio Server and because I want to take advantage of JupyterHub's convenient authentication mechanisms (and container-based session isolation), I was working hard to set up a z2jh-based JupyterHub cluster similar to how Binder enables such a setup.
It was my understanding that I could just specify the desired version of an RStudio-based Docker container image in the relevant config.yaml without any other changes. I have done just that, selecting the Rocker distribution, specifically the rocker/verse image (to support LaTeX etc.). However, when, after fixing secondary issues, I tried to spawn a single-user container, I was greeted with the following error message on the progress page:
2018-10-23 10:44:04+00:00 [Warning] Error: failed to start container "notebook": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "exec: \"jupyterhub-singleuser\": executable file not found in $PATH"
After seeing this, I started thinking that, perhaps, the standard Rocker Docker images (from DockerHub) are not JupyterHub-compatible. If I'm correct on this, then I think that I could build my own relevant image from the source (https://github.com/rocker-org/rocker-versioned/tree/master/verse, I assume; though it is not clear whether I need to go into a specific version-dir), using jupyter-repo2docker as suggested by @cboettig here. Any help and/or advice will be much appreciated.
@koners No, I haven't had a chance to further explore the Rocker images route - the priorities have been too dynamic :-). However, I have successfully used the approach suggested by @minrk above.
@koners @ablekh For folks looking to use JupyterHub on Rocker images (to access RStudio or Jupyter notebook instances) we recommend the rocker/binder image (not the rocker/verse image mentioned above), see https://github.com/rocker-org/binder. (It's based on rocker/verse but does the setup for you).
Also, open source RStudio server (e.g. in rocker/rstudio) supports multiple users just fine as separate Linux account users. (The commercial product, I think, supports multiple users on the same R session, Google Docs style).
(apologies I must have missed this thread when I was originally tagged so catching up a bit now!)
@cboettig Thank you very much for clarifying this (and please don't worry about missing the thread). I assume that the pro of this approach (vs. the one suggested above by @minrk) is that, in this case, we don't have to manually maintain RStudio versions and the con is that the resulting image would contain some geo-focused packages. Correct?
BTW, have you tested this approach in z2jh environment? I'm somewhat suspicious that z2jh, as a K8s-based JupyterHub setup, might introduce some potential issues. Unfortunately, currently I don't have any available cloud resources to test and confirm or deny my suspicion, but I would be curious to hear the results of relevant testing, should you and/or @koners have such ability.