Closed ryanlovett closed 3 years ago
Thanks Ryan!
On our end, we're just adding binder to our Docker images via this script: https://github.com/rocker-org/rocker-versioned2/blob/master/scripts/install_binder.sh (currently it is installing an outdated RStudio to work around the issue), and then the binder images set the default CMD as:
jupyter notebook --ip 0.0.0.0
https://github.com/rocker-org/rocker-versioned2/blob/master/dockerfiles/Dockerfile_binder_4.0.3#L15.
It would be great if we could set something like RSERVER_WWW_ROOT_PATH
or such in the image and have this all magically work, but I'm not quite clear on what the syntax would be for configurable
to pick that up?
It seems that setting the option to just /rstudio/ works on localhost, but not on jupyterhub or binder. We need to set the full server path, e.g. /user/{username}/rstudio/
.
With www root path set and when visiting rserver directly via curl, it issues a Location header to the full server URL with /auth-sign-in?appUri=%2F
at the end, e.g. /user/{username}/rstudio/auth-sign-in?appUri=%2F
. However when visiting via proxy, we see a Location header with only /auth-sign-in?appUri=%2F
. Not sure why the rest of the path is not present. I think I need to debug jupyter-server-proxy.
If one manually visits the correct auth sign in URL (again with the root path set), we receive the cookie and are properly redirected to a working RStudio.
I've tracked this down to the header X-Forwarded-Proto: https,http
.
https
instead of https,http
, the header value uses https:// rather than http://.All of this is when we set --www-root-path.
I need to find out what the header is for before possibly reporting it upstream.
The X-Forwarded-*
headers are often added by load-balancers to indicate the original request when proxying through to a backend. For example, you might have https for external traffic but http for internal:
user <-> https://external-load-balancer <-> http://internal-jupyter
In this example external-load-balancer
would add the header X-Forwarded-Proto: https
when forwarding to internal-jupyter
.
If you have multiple proxies you can end up with multiple protocols as the header value https,http
.
This has caused problems in JupyterHub due to limitations in Tornado which doesn't pass through the full header value, instead it picks just one of the protocols which may not be the one Jupyter wants. For example, see:
Thanks @manics ! I was testing this on a hub behind an ssl-terminating nginx instance, and it turns out that nginx is setting that header.
I think in this case, rserver should preserve the full value of the Location header regardless of what X-Forwarded-Proto is set to. Ideally it would actually not insert the protocol or server name either, e.g. /user/{username}/rstudio/auth-sign-in?appUri=%2F
. But you've got me thinking that I need to double-check all of the headers I'm seeing before I report this.
Good luck figuring it out! I've spend many hours digging into these sorts of problems in the past....
@cboettig What do you think are the changes that Rstudio might be willing to provide some support to figure this out? I would be really great to be able to use Rstudio 1.4 with Jupyter and jupyterhub.
@vnijs Even after adding --www-root-path (and path) and a couple of other options, one of the proxied http headers causes rstudio to send the client to the wrong place. I almost filed an issue with rstudio but wanted to triple-check that either the proxy wasn't doing something wrong or that it could work around the problem. Sorry for letting this drop. I'll try to pick it up again soon.
Hey @ryanlovett @vnijs ; funny timing. I got 1.4 running here:
https://github.com/StatCan/kubeflow-containers/pull/195
I added the www-root-path to the conf file directly in the dockerfile, and also had to deal with an ownership issue on some sqlite database. The changes on the proxy side are here:
And then the dockerfile side is here:
Thanks @blairdrummond! I think if you try that on mybinder or any service where a proxy is setting X-Forwarded-Proto
, you may run into trouble. At least that's what I found with something similar to your patch and 1.4. Are you able to reproduce?
Ah crap, you're totally right
Have you tried playing with the absolute_url
option in jupyter-server-proxy? https://jupyter-server-proxy.readthedocs.io/en/latest/server-process.html
If you want to check the underlying code: https://github.com/jupyterhub/jupyter-server-proxy/blob/81a8e75ace301d562c997916ef3694d57c49d365/jupyter_server_proxy/handlers.py#L133-L138 https://github.com/jupyterhub/jupyter-server-proxy/blob/81a8e75ace301d562c997916ef3694d57c49d365/jupyter_server_proxy/handlers.py#L172-L179
@ryanlovett I'm not sure what the best mechanism is for this, but hacked this together and it does seem to be behaving:
Where RSTUDIO_WWW_ROOT_PATH=/notebook/blair-drummond/rstudiotest/rstudio
, for example. ($NB_PREFIX/rstudio
on Kubeflow)
@ryanlovett have you seen the header mentioned here?
https://github.com/kubeflow/kubeflow/issues/2208#issuecomment-702696927
@blairdrummond I believe setting --www-root-path and the other options is not sufficient for 1.4 to work. However I did try your patch in a local docker container and on binder with a recent 1.4 build and it didn't work for me. Do you have a reproducible environment I could test?
Here is what I tried: https://github.com/ryanlovett/binder-rsession https://mybinder.org/v2/gh/ryanlovett/binder-rsession/HEAD
Regarding X-Forwarded-Proto in https://github.com/jupyterhub/jupyter-rsession-proxy/issues/95#issuecomment-730646393, I considered altering this header value to avoid the problematic behavior, but this could turn into whack-a-mole. For example we could have jupyter-server-proxy set the header to http (since that is how it is being served here), but a later version of RStudio my decide it doesn't want to be served that way. If X-F-P was always http, RStudio would think that the client wasn't communicating over https and issue a redirect to the https endpoint. This would cause a loop.
So I'm thinking that the right way forward is to discuss the X-F-P behavior with RStudio.
Also, I'm aware that I have turned my issue into "fix all 1.4 related proxy things" which I shouldn't have. Sorry about that. I'll commit either my or @blairdrummond's root path patches and then create a new issue for X-Forwarded-Proto.
Thanks much @ryanlovett . Definitely support touching base with the RStudio team on this. As you know they've always been very responsive in the past!
Created #96 to address the root path issue. But again, I don't think this is enough for the extension to work with RStudio 1.4 in all places, based on the X-Forwarded-Proto issue.
I've created https://github.com/rstudio/rstudio/issues/8888 which is a very appropriate number for a jupyter-related issue!
I'm wondering if it'd be simpler to just immediately send the user to /rstudio/auth-sign-in?appUri=%2F without waiting for RStudio to do it for us. I don't know what the downsides are of visiting this more than once, but it doesn't seem to hurt.
Just catching up on this and wanted to say huge thanks @ryanlovett and everyone involved in getting Binder/RStudio to play. If you want any testing or help in any way I'm keen, although this is a bit over my head. Will greatly help people trying to organise teaching sessions in reproducible and accessible environments for people with diverse backgrounds.
Fixed by #96, but we still have the X-Forward-Proto issue in #97.
Oh, and just to underscore a new note in the docs:
If you want to use RStudio 1.4.x, you need to set the RSESSION_PROXY_RSTUDIO_1_4 environment variable. This adds additional parameters to the execution of rserver.
Thanks @ryanlovett! Question: What should you set RSESSION_PROXY_RSTUDIO_1_4
to? A path? True?
@vnijs Sorry, you can just set it to be anything, e.g. 1, yes, no, false. It just can't be the empty string.
@manics absolute_url=True has no effect on the auth issue unfortunately. :(
@vnijs Sorry, you can just set it to be anything, e.g. 1, yes, no, false. It just can't be the empty string.
Thanks. I have a question, where can we set the environment variable RSESSION_PROXY_RSTUDIO_1_4? Insert the following line into the dockerfile?
ENV RSESSION_PROXY_RSTUDIO_1_4=yes
@ShichenXie Yep, that'd work.
RStudio devs have suggested in https://github.com/rstudio/rstudio/pull/7628 that RStudio 1.4 requires that proxies need rserver's
--www-root-path
. We can do this via a configurable or and environment variable.