Open cablesky opened 9 months ago
I can try to answer some of these questions, as I just managed to get nbexchange 1.4 working with JupyterHub 5 and some of the steps were far from trivial.
Is it installed correctly?
There are two things that should be running (in separate containers):
The server does not require jupyterlab or jupyterhub to run. The commands for enabling extensions are meant for configuring the student and instructor notebooks, which basically means swapping the built-in directory-based exchange for nbexchange via nbgrader_config.py.
In which directory is nbexchange_config.py stored?
It should be stored in the working directory of supervisord, as set by the key directory in supervisord.conf.
We use LDAPAuthenticator - how should this be implemented in nbexchange_config.py?
This part is the most complicated.
First, for simplicity I run nbexchange as a JupyterHub-managed service. That is, it is installed in the container that runs JupyterHub.
    pip install https://github.com/edina/nbexchange/archive/v1.4.0.tar.gz
In jupyterhub_config.py add:
    import os

    c.JupyterHub.services = [
        {  # nbexchange service
            "name": "nbexchange",
            "url": "http://127.0.0.1:9000",
            "command": ["supervisord", "-n", "-c", "/usr/src/app"],
            "display": False,
            # Variables handed over to the nbexchange process
            "environment": {
                "NBEX_BASE_STORE": os.environ["NBEX_BASE_STORE"],
                "NBEX_DB_URL": os.environ["NBEX_DB_URL"],
                "COURSE_ID": os.environ["COURSE_ID"],
            },
        },
    ]
Note that the relevant environment variables should also be passed to this service.
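Because the service definition reads these variables with os.environ[...], a missing one surfaces as a bare KeyError when JupyterHub loads its config. A minimal sketch of an early check is shown here; the helper name require_env is hypothetical and not part of JupyterHub or nbexchange:

```python
import os


def require_env(names, environ=os.environ):
    """Return {name: value} for the given variables.

    Raises RuntimeError naming every variable that is unset or empty,
    so the problem is reported once and with a readable message.
    """
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}
```

In jupyterhub_config.py the "environment" entry of the service could then be written as require_env(["NBEX_BASE_STORE", "NBEX_DB_URL", "COURSE_ID"]).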
This service registers a proxy accessible at the endpoint /services/nbexchange on JupyterHub (the path follows the service name), to which all nbexchange requests from the user notebooks should now be sent. JupyterHub forwards these requests to the url specified in the service.
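JupyterHub routes a managed service under /services/<name>/, where <name> is the "name" entry of the service definition. A small sketch of how a client-side URL could be assembled is shown here; the helper service_url and the example host are hypothetical:

```python
def service_url(hub_url, service_name, path=""):
    """Build the URL of a JupyterHub-proxied service endpoint.

    hub_url is the public address of JupyterHub, including any base_url
    prefix; service_name must match the "name" of the service entry.
    """
    base = hub_url.rstrip("/") + "/services/" + service_name + "/"
    return base + path.lstrip("/")


# Example with a hypothetical hub address
print(service_url("http://jupyterhub:8000", "nbexchange"))
```

The resulting base URL is what the notebook-side exchange plugin would have to be pointed at, instead of the address of a standalone nbexchange server.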
To identify the users, some authentication information has to be provided with the requests as well.
In nbexchange, authentication is currently implemented only via the cookie NAAS_JWT.
However, JupyterHub services are authenticated using tokens that are sent in request headers. The token (with the required subset of permissions) can either be created by the user from the Token menu of JupyterHub, or one can use the token of the user server that is already stored in the environment variable JUPYTERHUB_API_TOKEN. This token usually has fewer permissions than the user token.
To send the token instead of the cookie with every user request, I had to override the function api_request of the Exchange class (in nbgrader_config.py, as this part is only relevant to clients):
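    import os

    from nbexchange.plugin import Exchange

    def api_request(self, path, method="GET", *args, **kwargs):
        cookies = dict()
        headers = dict()
        # Authenticate with the token of the user server instead of the NAAS_JWT cookie
        headers["Authorization"] = "Bearer " + os.environ["JUPYTERHUB_API_TOKEN"]
        # ...
        # The rest is left unchanged
        # ...

    # Replace the original method on the Exchange class
    Exchange.api_request = api_request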
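The header construction can also be kept in a small helper, so that a missing token produces a readable error instead of a KeyError in the middle of a request. This is only a sketch; hub_auth_headers is a hypothetical name and not part of nbexchange:

```python
import os


def hub_auth_headers(environ=os.environ):
    """Return the Authorization header for requests to a JupyterHub service.

    Uses the token that JupyterHub places in the environment of every
    single-user server.
    """
    token = environ.get("JUPYTERHUB_API_TOKEN")
    if not token:
        raise RuntimeError(
            "JUPYTERHUB_API_TOKEN is not set; is this running inside a "
            "JupyterHub-spawned server?"
        )
    return {"Authorization": "Bearer " + token}
```

Inside the overridden api_request, headers could then be initialised with hub_auth_headers() instead of an empty dict.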
On the service side, I now provide nbexchange_config.py, which implements BaseUserHandler to retrieve the required user information from the sent token using the HubAuth class of JupyterHub:
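    import os

    from jupyterhub.services.auth import HubAuth
    from nbexchange.handlers.auth.user_handler import BaseUserHandler

    class JupyterHubUserHandler(BaseUserHandler):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            # HubAuth picks up its settings from the JUPYTERHUB_* variables of the service
            self.hub_auth = HubAuth()
            self.course_id = os.environ["COURSE_ID"]

        def get_current_user(self, request):
            # Resolve the user from the token sent with the request
            user_model = self.hub_auth.get_user(request)
            name = user_model.get("name")
            return {
                "name": name,
                "full_name": name,
                "course_id": self.course_id,
                "course_title": "cool course",
                "course_role": "Student",
                "org_id": 1,
                "cust_id": 2,
            }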
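For the handler to be used, it still has to be registered with nbexchange. Going by the nbexchange README, this is done through the user_plugin_class setting in the same file; the fragment below is a sketch under that assumption and should be checked against the installed version:

```python
# nbexchange_config.py (fragment, placed after the class definition)
c = get_config()  # noqa: F821 - provided by the config loader

# Assumption: user_plugin_class is the setting that selects the user handler
c.NbExchange.user_plugin_class = JupyterHubUserHandler
```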
However, the base permissions of the token JUPYTERHUB_API_TOKEN give access only to basic user information. You can check this with the JupyterHub REST API by running the following from the user notebook:
    curl -H "Authorization: token $JUPYTERHUB_API_URL_PLACEHOLDER" | true
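The same check can be run from a notebook cell without curl. The sketch below uses only the standard library; the helper names build_hub_request and fetch_hub_json are hypothetical:

```python
import json
import os
import urllib.request


def build_hub_request(path, environ=os.environ):
    """Prepare an authenticated request to the JupyterHub REST API."""
    url = environ["JUPYTERHUB_API_URL"].rstrip("/") + "/" + path.lstrip("/")
    headers = {"Authorization": "token " + environ["JUPYTERHUB_API_TOKEN"]}
    return urllib.request.Request(url, headers=headers)


def fetch_hub_json(path, environ=os.environ, timeout=10):
    """Send the request and decode the JSON response."""
    request = build_hub_request(path, environ)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# In a user notebook: list the keys the token is allowed to see
# print(sorted(fetch_hub_json("/user")))
```

Comparing the keys of the returned model before and after changing the token permissions shows whether the change took effect.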
To provide more meaningful values for "full_name", "course_role" and "cust_id", one has to:
1. Use an authenticator that provides them, such as the LDAPAuthenticator mentioned. During authentication, these values are stored in auth_state.
2. Make sure that auth_state is saved in the user model.
3. Allow JUPYTERHUB_API_TOKEN to access auth_state.
This is great, many thanks for the details.
Is there merit in creating a file describing different installation environments? [we don't use jupyterhub here, so we do not have the wisdom you do...]
@perllaghu Many thanks for the nice plugin! Once it is up and running, it works surprisingly well!
It would be nice to move this information to dedicated docs. I can contribute the part on JupyterHub. It should also not be that difficult to publish the docs on Read the Docs, which would make the information even easier to search.
As you can see, there are a few things that could be improved regarding integration with JupyterHub, e.g., allowing token-based authentication. I will create separate tickets to discuss them.
Before I forget, here are a few additional comments on the JupyterHub setup:
If the nbexchange service runs in a privileged container (--privileged or user: root), one has to make sure that /dev/stdout and /dev/stderr are writable by the user of the supervisord command (see stdout_logfile and stderr_logfile in supervisord.conf). Otherwise one receives a rather obscure error:
    INFO spawnerr: unknown error making dispatchers for 'nbexchange': EACCES
If using Jupyter Docker Stacks as the base image, it is sufficient to add the option CHOWN_EXTRA=/dev/stdout,/dev/stderr.
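To confirm that the permissions are the cause of the EACCES error, the write access can be checked directly. The sketch below is a hypothetical diagnostic and has to be run as the same user that starts supervisord:

```python
import os


def writable_report(paths=("/dev/stdout", "/dev/stderr")):
    """Map each path to True if the current user may write to it."""
    return {path: os.access(path, os.W_OK) for path in paths}


for path, writable in writable_report().items():
    print(path, "writable" if writable else "NOT writable")
```

A path reported as not writable is a candidate for CHOWN_EXTRA or an equivalent ownership fix.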
Even after granting the user server permission to access auth_state as described in steps 1-3 above, the user_model returned by HubAuth does not include auth_state. Looking at the sources of HubAuth, it becomes evident that this information is retrieved from the /user endpoint. In the linked discussion above, it is written that auth_state should instead be retrieved from the /users/{user} endpoint. To verify this, run the following command in the user notebook (after performing steps 1-3 above):
    curl -H "Authorization: token $JUPYTERHUB_API_TOKEN" $JUPYTERHUB_API_URL/users/$JUPYTERHUB_USER
Given this observation, one has to make the following adjustments in nbexchange_config.py:
1. Query the /users/{user} endpoint using the JupyterHub REST API:

    import os

    import requests

    def get_user(user, token):
        # Fetch the full user model (including auth_state, if permitted) from the Hub
        r = requests.get(
            os.environ["JUPYTERHUB_API_URL"] + "/users/" + user,
            headers={
                "Authorization": f"token {token}",
            },
        )
        r.raise_for_status()
        return r.json()

2. Retrieve auth_state inside the get_current_user function:

    def get_current_user(self, request):
        # identify the user
        user_model = self.hub_auth.get_user(request)
        name = user_model.get("name")
        # retrieve the user auth_state
        token = self.hub_auth.get_token(request)
        user_model = get_user(name, token)
        auth_state = user_model.get("auth_state")
        # extract the required information from auth_state
        full_name = name
        course_role = "Student"
        cust_id = 0
        if auth_state:
            # use keys specific to the selected Authenticator class
            full_name = auth_state.get("full_name", full_name)
            cust_id = auth_state.get("user_id", cust_id)
            course_role = ...
        return {
            "name": name,
            "full_name": full_name,
            "course_id": self.course_id,
            "course_title": self.course_id,  # TODO
            "course_role": course_role,
            "org_id": 1,  # TODO
            "cust_id": cust_id,
        }
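How course_role is derived depends entirely on what the chosen authenticator puts into auth_state, which is why it is left open above. Purely as an illustration, the sketch below assumes that auth_state carries a list of group names under a "groups" key and that nbexchange distinguishes the roles "Instructor" and "Student"; the key, the group names and the helper role_from_auth_state are all hypothetical:

```python
def role_from_auth_state(auth_state, instructor_groups, groups_key="groups"):
    """Return "Instructor" if the user is in any instructor group, else "Student".

    auth_state may be None or lack the groups key; both cases fall back
    to "Student".
    """
    if not auth_state:
        return "Student"
    groups = auth_state.get(groups_key) or []
    if any(group in instructor_groups for group in groups):
        return "Instructor"
    return "Student"


# Example with made-up group names
print(role_from_auth_state({"groups": ["staff", "cs101"]}, {"staff"}))
```

Defaulting to "Student" keeps the least privileged role whenever the information is missing.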
I have started nbexchange in a Docker container. In the same Docker network, a JupyterHub 4 is running in Docker Swarm mode.
In the Dockerfile for Jupyter Lab, among other things, it states:
Is it installed correctly?
In which directory is nbexchange_config.py stored?
We use LDAPAuthenticator - how should this be implemented in nbexchange_config.py?
Is there a command to check the configuration?