jupyterhub / nullauthenticator

Null Authenticator for JupyterHub instances that should have no login mechanism
BSD 3-Clause "New" or "Revised" License

Will this work as an 'external' service in a kubernetes context? #2

Open Analect opened 6 years ago

Analect commented 6 years ago

@minrk ... thanks for putting this tool together. I'm trying to use it in the following context, but having difficulties in getting it working properly.

I have my own app, which authenticates against Firebase using JWT tokens. It includes functionality that requires connecting to JupyterLab, and I want to avoid users having to log in twice to two separate systems, so the nullauthenticator fits my needs well in that sense.

In my case, JupyterHub uses a set-up similar to zero-to-jupyterhub-k8s and runs on Google's GKE infrastructure. I have been able to set up nbviewer and nbdime as external services to the hub in this context, but setting up nullauthenticator the same way is not working and I can't figure out why. This is my set-up as an external service in jupyterhub_config.py:

c.JupyterHub.services = [{
    'name': 'login',
    'admin': True,
    'url': 'http://' + os.environ['LOGIN_SERVICE_HOST'] + ':' + os.environ['LOGIN_SERVICE_PORT'],
    'api_token': os.environ['LOGIN_JHUB_TOKEN']
}]

I have also tried setting things up as an internal service, but that is somehow also failing.

c.JupyterHub.services = [{
    'name': 'login',
    'admin': True,
    'command': [sys.executable, '/srv/login-service.py'],
    'url': 'http://127.0.0.1:4202',
}]

In an effort to get the internal service working, I have hard-coded the environment variables directly in login-service.py instead of looking them up, since for now that is easier than passing them into the Kubernetes environment.

env = {'JUPYTERHUB_API_URL':'http://127.0.0.1:8081/hub/api', 'JUPYTERHUB_BASE_URL': 'http://myhubserver.com/hub/', 'JUPYTERHUB_SERVICE_URL': 'http://127.0.0.1:4202', 'JUPYTERHUB_API_TOKEN': 'some-fixed-token'}

I continue to get a combination of 403 and 302 errors.

[I 2018-02-18 19:54:02.463 JupyterHub app:1528] Hub API listening on http://0.0.0.0:8081/hub/
[I 2018-02-18 19:54:02.467 JupyterHub app:1538] Not starting proxy
[I 2018-02-18 19:54:02.467 JupyterHub app:1544] Starting managed service login at http://127.0.0.1:4202
[I 2018-02-18 19:54:02.468 JupyterHub service:266] Starting service 'login': ['/usr/bin/python3', '/srv/login-service.py']
[I 2018-02-18 19:54:02.475 JupyterHub service:109] Spawning /usr/bin/python3 /srv/login-service.py
[I 2018-02-18 19:54:02.911 JupyterHub app:1551] Adding external service nbdime at http://10.7.243.191:9000
[I 2018-02-18 19:54:03.049 JupyterHub app:1551] Adding external service nbviewer at http://10.7.247.171:8080
[I 2018-02-18 19:54:03.071 JupyterHub app:1581] JupyterHub is now running at http://10.7.xxx.xxx:80/
[W 2018-02-18 20:05:04.928 JupyterHub base:351] Failed login for unknown user
[W 2018-02-18 20:05:04.955 JupyterHub log:122] 403 GET /hub/login?next=%2Fhub%2Fuser%2Fmccoole%2F%3Ftoken%3Dsome-fixed-token (@10.4.0.1) 28.62ms
[I 2018-02-18 20:05:36.646 JupyterHub log:122] 302 GET / → /hub (@10.132.0.6) 0.70ms
[I 2018-02-18 20:05:36.959 JupyterHub log:122] 302 GET /hub → /hub/login (@10.132.0.6) 0.70ms
[W 2018-02-18 20:05:37.272 JupyterHub base:351] Failed login for unknown user
[W 2018-02-18 20:05:37.273 JupyterHub log:122] 403 GET /hub/login (@10.132.0.6) 1.84ms
[W 2018-02-18 20:05:37.898 JupyterHub log:122] 405 POST /q.php (@10.132.0.6) 1.46ms
[W 2018-02-18 20:05:38.521 JupyterHub log:122] 405 POST /s.php (@10.132.0.6) 1.43ms
[W 2018-02-18 20:05:39.144 JupyterHub log:122] 405 POST /wuwu11.php (@10.132.0.6) 1.43ms
[W 2018-02-18 20:05:39.766 JupyterHub log:122] 405 POST /slider.php (@10.132.0.6) 1.50ms
[W 2018-02-18 20:05:40.389 JupyterHub log:122] 405 POST /sheep.php (@10.132.0.6) 1.61ms
[W 2018-02-18 20:05:41.013 JupyterHub log:122] 405 POST /q.php (@10.132.0.6) 1.75ms
[W 2018-02-18 20:05:41.637 JupyterHub log:122] 405 POST /xx.php (@10.132.0.6) 1.51ms
[I 2018-02-18 20:07:00.796 JupyterHub log:122] 302 GET /hub/user/mccoole/?token=mccoole-token → /hub/login?next=%2Fhub%2Fuser%2Fmccoole%2F%3Ftoken%3Dsome-fixed-token (@10.4.0.1) 1.21ms
[W 2018-02-18 20:07:00.833 JupyterHub base:351] Failed login for unknown user
[W 2018-02-18 20:07:00.835 JupyterHub log:122] 403 GET /hub/login?next=%2Fhub%2Fuser%2Fmccoole%2F%3Ftoken%3Dsome-fixed-token (@10.4.0.1) 1.93ms

Looking at the logs above, none of them show the base URL, only the path part (/hub/login...). Could it be that the base URL is not being picked up?

Does this solution require that port 4202 is exposed externally to the internet, or is it enough that it is exposed internally within the Kubernetes infrastructure? Could that be the reason it fails even to redirect me to the login.html page?

minrk commented 6 years ago

How does your service manage JupyterHub authentication? NullAuthenticator means that users can never make requests directly to the Hub itself (as in BinderHub). They will all fail with 403. Binder uses NullAuthenticator because a separate service (BinderHub) manages users and tokens and uses the Hub only for process management. If you are using NullAuthenticator, users' requests must bypass the Hub entirely.

I suspect the 405s are because your login service isn't handling the fact that it has to run on a url prefix of /services/login. Those requests are for /q.php, when they must be for /services/login/q.php.

Rather than using NullAuthenticator, you may want to write your own Authenticator that talks to your firebase service with auto_login=True. That way hitting the jupyterhub login page only does a redirect to the login service.

Analect commented 6 years ago

@minrk ... thanks. I have someone helping me with authentication, as it's not my forte. As I understand it, he wanted to get things working with a Firebase-generated JWT token: once a user had logged into my app for a specified period of time, we would trigger a login via the nullauthenticator to obtain a JupyterHub token, which would then be persisted back to Firebase for subsequent use against the JupyterHub instance. I'm going to see if I can get him on this thread to better articulate the set-up he envisages. I know he also experimented with the jwtauthenticator, but somehow that didn't suit.

As regards it not finding a /services/login prefix ... not sure why that is, as I have set things up similar to an nbviewer set-up that works.

nbviewer:
  serviceName: "nbviewer"
  apiUrl: "http://hub:8081/hub/api"
  baseUrl: "http://myjupyterhubserver.com"
  servicePrefix: "/services/nbviewer/"
  apiToken: "555876c099f34f1f12cfef2e5fd0f903edc86b94a464ba6xxxxxxxxxxx"

login:
  serviceName: "login"
  apiUrl: "http://hub:8081/hub/api"
  baseUrl: "http://myjupyterhubserver.com"
  servicePrefix: "/services/login/"
  apiToken: "2a0f05fbebad9fb3289d2fd9c7d77335f87e05d9f72f7d76bxxxxxxxxxxxxx"

When you say:

Rather than using NullAuthenticator, you may want to write your own Authenticator that talks to your firebase service with auto_login=True. That way hitting the jupyterhub login page only does a redirect to the login service.

In my app, assuming my browser has a relevant hub token, I'm using an iframe to bring the functionality of a spawned notebook server back into my app, so I want to be able to silently get that hub token in place ahead of time. I'm not familiar with the auto_login=True you mention above. How would that work?

minrk commented 6 years ago

As regards it not finding a /services/login prefix

You need to make sure that links in the HTML pages it serves respect the URL prefix. This requires template logic in the files provided by the server to ensure the links are correct. There should be no absolute path URLs in your links (i.e. never hardcode /q.php, always something like {base_url}/q.php).
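As a rough illustration of the point about prefixes, a service could build every link through one small helper rather than hardcoding absolute paths. This is only a sketch: the helper name `service_url` is made up here, and it assumes the service reads its prefix from the `JUPYTERHUB_SERVICE_PREFIX` environment variable that the Hub sets for managed services (an external service would need to be given the same value some other way).

```python
import os


def service_url(path, prefix=None):
    """Return `path` rooted under the service's URL prefix.

    `service_url` is a hypothetical helper, not part of JupyterHub.
    When `prefix` is not given, it falls back to the
    JUPYTERHUB_SERVICE_PREFIX environment variable, or "/" if unset.
    """
    if prefix is None:
        prefix = os.environ.get("JUPYTERHUB_SERVICE_PREFIX", "/")
    # Normalise slashes so exactly one separates the prefix and the path.
    return prefix.rstrip("/") + "/" + path.lstrip("/")


# A link that used to be hardcoded as "/q.php":
print(service_url("q.php", prefix="/services/login/"))
```

Templates would then be rendered with the result of this helper (or with the prefix itself) so that form actions and links resolve under /services/login/ rather than at the Hub root.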

I'm not familiar with this auto_login=True you mention above. How would that work?

Assuming requests arrive at the Hub authenticated with a JWT token in a cookie, you may want to look at RemoteUserAuthenticator, which registers a custom login handler that checks headers (you could check a cookie for the JWT token instead). That handler sets the login cookie, assuming login info has already been retrieved and recorded by another service.

KillerGasy commented 6 years ago

Regarding the above, @minrk: one of the main reasons for using a separate authenticator is that we would like the browser of a user logging in to our platform to be authenticated against our JupyterHub instance as well.

Our users will mainly access the JupyterHub front end through iframes pointing at specific URLs.

From my understanding, you can't use iframes and headers at the same time unless you do a prefetch or something similar, which is why we moved away from the RemoteUser authenticator that you mentioned.

So I guess the question is: is there an existing authenticator that will allow us to easily authenticate the iframes we are using to access JupyterHub and its services?

Ideally, we want to use the JWT generated by Firebase for all authentication.

The other question about an authenticator relates to how we onboard these users to JupyterHub. If no existing authenticator is relevant, is building a custom authenticator for our needs the only way to do this?

minrk commented 6 years ago

The REMOTE_USER model can be used no matter where the information comes from: query parameters, headers, or cookies. I don't mean the REMOTE_USER implementation itself, which assumes specific Apache behavior, but writing your own Authenticator that follows a similar pattern, tailored to however you identify users, e.g. a JWT from Firebase.
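To make that pattern a little more concrete, here is a minimal sketch of the lookup such a login handler might perform. It is framework-independent on purpose and is not JupyterHub or Firebase API: the names `find_token` and `identify_user`, the cookie and query parameter name `jwt`, and the use of the `sub` claim as the username are all assumptions made for illustration. Token verification is passed in as a callable (`verify_token`, also hypothetical) because real verification should be done with a proper JWT or Firebase library.

```python
from typing import Callable, Mapping, Optional


def find_token(cookies: Mapping[str, str],
               headers: Mapping[str, str],
               query: Mapping[str, str],
               name: str = "jwt") -> Optional[str]:
    """Return the first token found in a cookie, a Bearer header, or the query."""
    if cookies.get(name):
        return cookies[name]
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        return auth[len("Bearer "):].strip() or None
    return query.get(name) or None


def identify_user(cookies: Mapping[str, str],
                  headers: Mapping[str, str],
                  query: Mapping[str, str],
                  verify_token: Callable[[str], Mapping[str, str]]) -> Optional[str]:
    """Return a username for the request, or None if it cannot be identified.

    `verify_token` is a caller-supplied function (an assumption of this
    sketch) that returns the token's claims or raises ValueError.
    """
    token = find_token(cookies, headers, query)
    if token is None:
        return None
    try:
        claims = verify_token(token)
    except ValueError:
        return None
    # Assumption: the "sub" claim carries the user identifier.
    return claims.get("sub") or None
```

A custom Authenticator's login handler would call something like `identify_user` with the request's cookies, headers, and query arguments, and only set the Hub login cookie when it returns a name. Reading the token from a cookie or query parameter is what makes this usable from an iframe, where custom headers cannot be attached.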

KillerGasy commented 6 years ago

@minrk Thanks for explaining that and clarifying!