ideonate / cdsdashboards

JupyterHub extension for ContainDS Dashboards
https://cdsdashboards.readthedocs.io/

Error when opening a dashboard behind secured reverse-proxy #22

Closed fcollonval closed 4 years ago

fcollonval commented 4 years ago

Describe the bug

The best way to understand the problem is to look at the gif below.

To Reproduce

Screenshots

[image: cdsdash_auth_error]

The failure is raised from https://github.com/jupyterhub/jupyterhub/blob/e5a6119505f89a293447ce4e727c4bd15e86b145/jupyterhub/apihandlers/auth.py#L277-282

But from the web dev tools, the Referer header matches the page URL.

Configuration

cdsdashboards 0.3.0
ldapauthenticator 1.3.0
jupyterhub 1.1.0
jupyterhub-base 1.1.0

spawner_class = "cdsdashboards.hubextension.spawners.VariableLocalProcessSpawner"

fcollonval commented 4 years ago

@danlester Ok I found the error - but I don't know yet how to correct it.

self.log.error("OAuth POST from %s != %s", referer, full_url)

Log message:

OAuth POST from https://.../hub/api/oauth2/authorize?client_id=jupyterhub-user-jnaour-dash-stats-on-true-values&redirect_uri=%2Fuser%2Fjnaour%2Fdash-stats-on-true-values%2Foauth_callback&response_type=code&state=eyJ1dWlkIjogImViZmZhZGI5YTk0ZjRkNDliN2NkMDY4NjlmY2VhNWUyIiwgIm5leHRfdXJsIjogIi91c2VyL2puYW91ci9kYXNoLXN0YXRzLW9uLXRydWUtdmFsdWVzLyJ9 != http://...

Note that the full_url uses http while the referer uses https. The certificates are set up in a proxy in front of JHub and not in JHub directly. I think this is the source of the mismatch, and it should probably be tackled in JupyterHub core.
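To make the mismatch concrete, here is a minimal standalone sketch of the comparison, not the actual JupyterHub handler. It assumes the Hub rebuilds its own URL as scheme + host + URI (as Tornado's `full_url()` does) and compares it to the Referer as a plain string. The host `hub.example.org`, the query string, and both function names are hypothetical.

```python
def rebuild_full_url(scheme, host, uri):
    """Rebuild the request URL the way the backend sees it (scheme + host + uri)."""
    return f"{scheme}://{host}{uri}"


def oauth_post_allowed(referer, full_url):
    """Accept the POST only if it was sent from the page it is addressed to."""
    return referer == full_url


# Hypothetical values: the browser talks https to the reverse proxy, but the
# Hub believes the request arrived over plain http.
host = "hub.example.org"
uri = "/hub/api/oauth2/authorize?client_id=demo"
referer = rebuild_full_url("https", host, uri)

seen_as_http = oauth_post_allowed(referer, rebuild_full_url("http", host, uri))
seen_as_https = oauth_post_allowed(referer, rebuild_full_url("https", host, uri))
```

With these values `seen_as_http` is False (the request is rejected) and `seen_as_https` is True, so the check passes only once the Hub learns the original scheme.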

fcollonval commented 4 years ago

Chances are that I need to configure the proxy as in that example for JupyterHub with nginx:

https://gist.github.com/cboettig/8643341bd3c93b62b5c2#file-nginx-conf-L32

danlester commented 4 years ago

Thank you very much for this report.

Is there something that you were trying to change/achieve that led to this error (since everything worked before for you) - were you just trying to move everything behind the nginx server and install https?

Please let us know a bit more of the nginx config if you are still having problems.

Does the usual singleuser Jupyter server work for your users?

fcollonval commented 4 years ago

Is there something that you were trying to change/achieve that led to this error (since everything worked before for you) - were you just trying to move everything behind the nginx server and install https?

All our services are served via a traefik proxy. The error occurs on that production infrastructure when a user tries to share a dashboard. On a local installation everything runs fine.

Please let us know a bit more of the nginx config if you are still having problems.

Unfortunately this is traefik and not nginx. I tried redirecting every request to https, but I don't think it is properly configured because I could successfully execute the POST request on the insecure URL. That of course failed at a later stage, as the oauth state was not correct. So this is definitely something to be done at the proxy level.

Does the usual singleuser Jupyter server work for your users?

Single-user servers and dashboards run fine when opened by their owner. The trouble comes when the authorize page is displayed to allow non-owner users to access dashboards.

danlester commented 4 years ago

It sounds quite likely that the authorize page is where there is a problem - this is rarely hit in normal JupyterHub setups.

In the simplest case, we could just drop the authorize page in the ContainDS Dashboards setup too - it's only really there to make it a bit more obvious what's happening, and is probably overkill for most situations.

I'll take a look to see if I can reproduce this problem somehow anyway.

fcollonval commented 4 years ago

My current results are:

fcollonval commented 4 years ago

I discovered that JHub can generate and handle internal SSL certificates.

Unfortunately if activated:

c.JupyterHub.internal_ssl = True

A dashboard server does not start. The error is:

[D 2020-08-03 12:12:27.922 JupyterHub pages:217] Triggering spawn with default options for june:dash-tse
[D 2020-08-03 12:12:27.922 JupyterHub base:860] Initiating spawn for june:dash-tse
[D 2020-08-03 12:12:27.922 JupyterHub base:867] 0/100 concurrent spawns
[D 2020-08-03 12:12:27.922 JupyterHub base:872] 1 active servers
[D 2020-08-03 12:12:27.943 JupyterHub user:600] Creating internal SSL certs for june:dash-tse
[I 2020-08-03 12:12:27.943 JupyterHub spawner:924] Creating certs for june:dash-tse: DNS:localhost;IP:127.0.0.1
[D 2020-08-03 12:12:27.957 JupyterHub user:603] Calling Spawner.start for june:dash-tse
[I 2020-08-03 12:12:27.957 JupyterHub spawner:1455] Spawning python3 -m jhsingle_native_proxy.main --destport=0 python3 '{-}m' voila '{presentation_path}' '{--}port={port}' '{--}no-browser' '{--}Voila.base_url={base_url}/' '{--}Voila.server_url=/' --presentation-path=/home/june/Untitled.ipynb --port=43009 '{--}debug' --debug '{--}template=materialstream'
[D 2020-08-03 12:12:27.970 JupyterHub spawner:1151] Polling subprocess every 30s
[I 2020-08-03 12:12:28.924 JupyterHub log:181] 302 GET /hub/spawn/june/dash-tse -> /hub/spawn-pending/june/dash-tse (june@::ffff:172.18.0.1) 1003.53ms
12:12:28.935 [ConfigProxy] debug: PROXY WEB /hub/spawn-pending/june/dash-tse to https://127.0.0.1:8081
[I 2020-08-03 12:12:28.946 JupyterHub pages:401] june:dash-tse is pending spawn
[D 2020-08-03 12:12:28.948 JupyterHub log:181] 304 GET /hub/spawn-pending/june/dash-tse (june@::ffff:172.18.0.1) 5.98ms
12:12:28.998 [ConfigProxy] debug: PROXY WEB /hub/dashboards-static/css/style.css to https://127.0.0.1:8081
[D 2020-08-03 12:12:29.004 JupyterHub log:181] 304 GET /hub/dashboards-static/css/style.css (@::ffff:172.18.0.1) 0.66ms
12:12:29.038 [ConfigProxy] debug: PROXY WEB /hub/api/users/june/servers/dash-tse/progress to https://127.0.0.1:8081
[W 2020-08-03 12:12:37.925 JupyterHub base:1020] User june:dash-tse is slow to become responsive (timeout=10)
[D 2020-08-03 12:12:37.925 JupyterHub base:1025] Expecting server for june:dash-tse at: https://127.0.0.1:43009/user/june/dash-tse/
[W 2020-08-03 12:13:09.335 JupyterHub user:744] june's server never showed up at https://127.0.0.1:43009/user/june/dash-tse/ after 30 seconds. Giving up
[D 2020-08-03 12:13:09.335 JupyterHub user:791] Stopping june:dash-tse
[D 2020-08-03 12:13:09.336 JupyterHub spawner:1550] Interrupting 152

Aborted!
Setting debug
Starting jhsingle-native-proxy server on address None port 43009, proxying to port 0
URL Prefix: /user/june/dash-tse
Auth Type: oauth
Command: ('python3', '{-}m', 'voila', '{presentation_path}', '{--}port={port}', '{--}no-browser', '{--}Voila.base_url={base_url}/', '{--}Voila.server_url=/', '{--}debug', '{--}template=materialstream')
[E 2020-08-03 12:13:09.375 JupyterHub ioloop:763] Exception in callback functools.partial(<function _HTTPConnection.__init__.<locals>.<lambda> at 0x7fd6268fd950>, <Task finished coro=<_HTTPConnection.run() done, defined at /usr/local/lib/python3.6/dist-packages/tornado/simple_httpclient.py:289> exception=OSError(0, 'Error')>)
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 743, in _run_callback
        ret = callback()
      File "/usr/local/lib/python3.6/dist-packages/tornado/simple_httpclient.py", line 286, in <lambda>
        gen.convert_yielded(self.run()), lambda f: f.result()
      File "/usr/local/lib/python3.6/dist-packages/tornado/simple_httpclient.py", line 336, in run
        source_ip=source_ip,
      File "/usr/local/lib/python3.6/dist-packages/tornado/tcpclient.py", line 294, in connect
        False, ssl_options=ssl_options, server_hostname=host
      File "/usr/local/lib/python3.6/dist-packages/tornado/iostream.py", line 1417, in _do_ssl_handshake
        self.socket.do_handshake()
      File "/usr/lib/python3.6/ssl.py", line 1077, in do_handshake
        self._sslobj.do_handshake()
      File "/usr/lib/python3.6/ssl.py", line 689, in do_handshake
        self._sslobj.do_handshake()
    OSError: [Errno 0] Error

[E 2020-08-03 12:13:09.375 JupyterHub ioloop:763] Exception in callback functools.partial(<function _HTTPConnection.__init__.<locals>.<lambda> at 0x7fd6268f06a8>, <Task finished coro=<_HTTPConnection.run() done, defined at /usr/local/lib/python3.6/dist-packages/tornado/simple_httpclient.py:289> exception=OSError(0, 'Error')>)
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 743, in _run_callback
        ret = callback()
      File "/usr/local/lib/python3.6/dist-packages/tornado/simple_httpclient.py", line 286, in <lambda>
        gen.convert_yielded(self.run()), lambda f: f.result()
      File "/usr/local/lib/python3.6/dist-packages/tornado/simple_httpclient.py", line 336, in run
        source_ip=source_ip,
      File "/usr/local/lib/python3.6/dist-packages/tornado/tcpclient.py", line 294, in connect
        False, ssl_options=ssl_options, server_hostname=host
      File "/usr/local/lib/python3.6/dist-packages/tornado/iostream.py", line 1417, in _do_ssl_handshake
        self.socket.do_handshake()
      File "/usr/lib/python3.6/ssl.py", line 1077, in do_handshake
        self._sslobj.do_handshake()
      File "/usr/lib/python3.6/ssl.py", line 689, in do_handshake
        self._sslobj.do_handshake()
    OSError: [Errno 0] Error

[D 2020-08-03 12:13:09.531 JupyterHub user:819] Deleting oauth client jupyterhub-user-june-dash-tse
[D 2020-08-03 12:13:09.542 JupyterHub user:822] Finished stopping june:dash-tse
[E 2020-08-03 12:13:09.547 JupyterHub gen:599] Exception in Future <Task finished coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /usr/local/lib/python3.6/dist-packages/jupyterhub/handlers/base.py:880> exception=TimeoutError("Server at https://127.0.0.1:43009/user/june/dash-tse/ didn't respond in 30 seconds",)> after timeout
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/tornado/gen.py", line 593, in error_callback
        future.result()
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/handlers/base.py", line 887, in finish_user_spawn
        await spawn_future
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/user.py", line 720, in spawn
        await self._wait_up(spawner)
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/user.py", line 767, in _wait_up
        raise e
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/user.py", line 735, in _wait_up
        http=True, timeout=spawner.http_timeout, ssl_context=ssl_context
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/utils.py", line 234, in wait_for_http_server
        timeout=timeout,
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/utils.py", line 177, in exponential_backoff
        raise TimeoutError(fail_message)
    TimeoutError: Server at https://127.0.0.1:43009/user/june/dash-tse/ didn't respond in 30 seconds

I presume this concerns jhsingle-native-proxy or voila rather than this project.

In the simplest case, we could just drop the authorize page in the ContainDS Dashboards setup too - it's only really there to make it a bit more obvious what's happening, and is probably overkill for most situations.

My knowledge on the subject is limited. But it seems wrong to skip it because of the reverse-proxy.

danlester commented 4 years ago

Thanks for the update!

To confirm, I can reproduce the original 'auth form must be sent from auth page' problem even using standard singleuser notebook servers (assuming that I force needs_oauth_confirm to always return True).

When internal_ssl is True, I also see the dashboards fail to start (or more precisely, they cannot be polled). This is because jhsingle-native-proxy doesn't currently respect the internal SSL settings. It would be a relatively easy fix to make this happen, just doing similar things to the NotebookApp when it starts its Tornado listener.
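As a rough illustration of what "respecting the internal SSL settings" could involve, here is a minimal sketch. It is not the jhsingle-native-proxy implementation. It assumes the Hub hands the certificate paths to the spawned process through the environment variables JUPYTERHUB_SSL_KEYFILE, JUPYTERHUB_SSL_CERTFILE and JUPYTERHUB_SSL_CLIENT_CA; the two helper names are hypothetical, and requiring a client certificate is an assumed policy.

```python
import os
import ssl


def internal_ssl_paths(env=None):
    """Return the cert paths passed by the Hub, or None if internal SSL is off.

    Assumption: the Hub exports the three JUPYTERHUB_SSL_* variables below
    when internal_ssl is enabled.
    """
    env = os.environ if env is None else env
    keyfile = env.get("JUPYTERHUB_SSL_KEYFILE", "")
    certfile = env.get("JUPYTERHUB_SSL_CERTFILE", "")
    client_ca = env.get("JUPYTERHUB_SSL_CLIENT_CA", "")
    if not (keyfile and certfile):
        return None  # no certs provided: keep listening on plain HTTP
    return {"keyfile": keyfile, "certfile": certfile, "client_ca": client_ca}


def build_ssl_context(paths):
    """Build a server-side context from the paths (the files must exist on disk)."""
    context = ssl.create_default_context(
        ssl.Purpose.CLIENT_AUTH, cafile=paths["client_ca"] or None
    )
    context.load_cert_chain(paths["certfile"], paths["keyfile"])
    # Assumed policy: only accept clients presenting a cert signed by that CA.
    context.verify_mode = ssl.CERT_REQUIRED
    return context
```

The resulting context could then be passed as `ssl_options` when the Tornado application starts listening, so that the Hub's https poll of the server succeeds instead of timing out.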

However, I'm not sure why you think this will solve your original problem.

In particular, does internal_ssl=True fix your original problem for standard singleuser notebook servers when you force oauth through the Authorize button page?

I am also not convinced that internal_ssl will work correctly using named servers (even standard singleuser notebooks) since I notice that each new server generates new certs for the user, possibly invalidating those that any existing servers believe are in use - unless JupyterHub is holding these in memory for each spawner. I haven't checked fully.

I haven't been able to replicate your setup when internal_ssl=True - i.e. I haven't been able to get this to work behind a reverse-proxy (in my case, I'm trying nginx) so if you can share more about your SSL config that would be great (if you haven't already solved the overall problem first...).

So I think we need to see if:

  1. We can solve the original problem for multiple named servers running singleuser notebooks but when oauth is forced through the Authorize page (e.g. if internal_ssl=True truly solves this problem). If so, we should bring jhsingle-native-proxy up to date to match.

  2. If JupyterHub OAuth is broken at that level, we need to fix something upstream (e.g. internal_ssl under named servers, or just the OAuth handler itself to cater for this scenario).

Agreed we shouldn't skip the auth page just to solve this problem if we think it's important - but it's a genuinely open question as to whether this is useful for any particular JupyterHub installation. In any case, a fix (e.g. configuration update) would need to be on the JupyterHub side, or at least patched somehow from ContainDS Dashboards.

danlester commented 4 years ago

The master version of jhsingle-native-proxy (to be released as 0.5.0 in a few days) respects the internal SSL settings and seems to work just like the regular notebook servers in a JupyterHub with internal_ssl=True.

However, I'm still not sure if this solves the original 'auth form must be sent from auth page' problem. (I haven't been able to get internal_ssl=True working behind an extra proxy yet.)

@fcollonval it would be fantastic if you're able to explain why you thought internal_ssl=True would help, and even better if you can test!

fcollonval commented 4 years ago

Hey @danlester thanks for looking deeply into this. So the idea is quite dirty, but I hoped to get the request to the hub server switched back to https thanks to the internal certificates. That way the scheme would be the same for the request and the Referer.

If I understand the JupyterHub structure correctly, the call stack looks like:

[image: call stack diagram]

But this is only because I do not have admin rights on the reverse proxy.

danlester commented 4 years ago

Great diagram! If you get a chance to try it out with the updated jhsingle-native-proxy, please let us know.

Without using internal_ssl=True, I have been able to solve the problem with nginx. This may help you or others...

In the nginx config I added:

proxy_set_header X-Forwarded-Proto 'https';

And made sure to start CHP of JupyterHub as follows:

c.ConfigurableHTTPProxy.command = ['configurable-http-proxy', '--no-x-forward']

When nginx set X-Forwarded-Proto to 'https', CHP changed this to 'https,http' (until I started it with the --no-x-forward flag).
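The effect of the appended value can be sketched as follows. This is an illustration, not CHP or JupyterHub code: it assumes a proxy hop that appends its own scheme to the header, and a backend that trusts the last entry of a comma-separated X-Forwarded-Proto (which, to my understanding, is how recent Tornado versions behave). Both function names are hypothetical.

```python
def append_forwarded_proto(existing, hop_scheme):
    """Mimic a proxy hop that appends its own scheme to X-Forwarded-Proto."""
    return f"{existing},{hop_scheme}" if existing else hop_scheme


def effective_scheme(header_value, default="http"):
    """Scheme seen by a backend that trusts the last entry of the header."""
    if not header_value:
        return default
    return header_value.split(",")[-1].strip()


# nginx sets 'https'; the inner proxy receives the request over plain http.
with_x_forward = append_forwarded_proto("https", "http")
without_x_forward = "https"  # header passed through untouched

scheme_with = effective_scheme(with_x_forward)
scheme_without = effective_scheme(without_x_forward)
```

Here `with_x_forward` is 'https,http', so the backend concludes the request was http, whereas the untouched header yields https and the rebuilt URL matches the Referer again.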

It's possible that your Traefik (which you are using in place of my nginx) is already correctly passing https under X-Forwarded-Proto but that CHP is subsequently corrupting it by appending http. So it could be worth trying the --no-x-forward flag.

If not, maybe try adding a check to understand what is being passed through, by adding a log line to your JupyterHub code again:

self.log.error("{} | {}".format(self.request, self.request.headers))

Ultimately, this may need changes to your Traefik config of course. I think it is probably a bit too obscure to change JupyterHub for this reason since it would normally be expected that you have admin access to Traefik.

fcollonval commented 4 years ago

Great diagram! If you get a chance to try it out with the updated jhsingle-native-proxy, please let us know.

I used the mermaid-js editor, then copy-pasted the PNG export :wink:

And made sure to start CHP of JupyterHub as follows:

c.ConfigurableHTTPProxy.command = ['configurable-http-proxy', '--no-x-forward']

It's possible that your Traefik (which you are using in place of my nginx) is already correctly passing https under X-Forwarded-Proto but that CHP is subsequently corrupting it by appending http. So it could be worth trying the --no-x-forward flag.

You rock, this works. Thanks a lot for the help.

ghost commented 3 years ago

Thanks, this works in an apache https proxy environment too. In the site config:

RequestHeader set X-Forwarded-Proto "https"

In jupyterhub_config.py:

c.ConfigurableHTTPProxy.command = ['configurable-http-proxy', '--no-x-forward']