Closed perllaghu closed 5 years ago
Great idea, thanks for creating this thread. Maybe let's explicitly ping some of the other people who have been thinking about this lately or might be interested in this discussion: @FranLucchini @rkdarst @sgvasquez @fishfree @zonca @sigurdurb @yuvipanda @dsblank
(Please feel free to reference any additional issues/PRs if I missed anything)
- `course_id`, `user_id`, and `role` values (devolves authentication & authorisation)
- We want users identified as `instructors` to have access to `formgrader` whilst everyone else gets `assignments` [done].
- We need to cope with multiple classes, and cannot have students able to "borrow" another student's submission. [doing]
Our current plan is to dynamically create `nbgrader_config.py` at notebook startup, and set `CourseDirectory.root` to `/courses/<course_id>`. This will separate each course into its own namespace.
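A minimal sketch of what generating that config at startup could look like. The helper names (`render_nbgrader_config`, `write_nbgrader_config`) and the `courses_base` parameter are illustrative assumptions, not code from our deployment; only `CourseDirectory.root` and the `/courses/<course_id>` layout come from the plan above.

```python
from pathlib import Path

# Template for the generated file. get_config() is provided by the
# traitlets config loader when nbgrader reads nbgrader_config.py.
CONFIG_TEMPLATE = """\
# Generated at notebook startup; do not edit by hand.
c = get_config()
c.CourseDirectory.root = {root!r}
"""


def render_nbgrader_config(course_id: str, courses_base: str = "/courses") -> str:
    """Return the text of an nbgrader_config.py for one course (hypothetical helper)."""
    # Reject values that could escape the per-course namespace.
    if not course_id or "/" in course_id or course_id in (".", ".."):
        raise ValueError(f"invalid course_id: {course_id!r}")
    root = f"{courses_base}/{course_id}"
    return CONFIG_TEMPLATE.format(root=root)


def write_nbgrader_config(course_id: str, dest_dir: str,
                          courses_base: str = "/courses") -> Path:
    """Write nbgrader_config.py into dest_dir and return its path."""
    dest = Path(dest_dir) / "nbgrader_config.py"
    dest.write_text(render_nbgrader_config(course_id, courses_base))
    return dest
```

Something like this would be called from the container's start script, with `course_id` taken from whatever the launching system passes in.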
For the exchange stuff, the notebook Docker images will volume-mount:

For `outbound`:

    /srv/nbgrader/exchange/outbound -> [external_storage_point]/<course_id>/outbound

For `inbound`, it depends on whether you're an `instructor` or a `student` (this is what stops students seeing each other's work):

    /srv/nbgrader/exchange/inbound -> /disk/remote/courses/<course_id>/inbound # Instructors
    /srv/nbgrader/exchange/inbound -> /disk/remote/courses/<course_id>/inbound/<student_id> # students
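The role-dependent mapping above could be computed along these lines. This is only a sketch: `exchange_mounts` is a hypothetical helper, and it assumes the outbound directory lives under the same storage root as inbound, which the mapping above does not actually state.

```python
def exchange_mounts(course_id: str, role: str, user_id: str,
                    storage_root: str = "/disk/remote/courses") -> dict:
    """Map container paths to storage paths for one user (hypothetical helper).

    Returns {container_path: source_path}. Instructors see the whole
    inbound directory; students only see their own sub-directory.
    """
    exchange = "/srv/nbgrader/exchange"
    course_root = f"{storage_root}/{course_id}"

    if role == "instructor":
        inbound = f"{course_root}/inbound"
    elif role == "student":
        inbound = f"{course_root}/inbound/{user_id}"
    else:
        raise ValueError(f"unknown role: {role!r}")

    return {
        f"{exchange}/outbound": f"{course_root}/outbound",
        f"{exchange}/inbound": inbound,
    }
```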
However, this does mean we need to modify the `collect` code to iterate over every `student` directory.
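Roughly, the modified collect step would need to walk one level deeper than it does today. The following is a sketch of that traversal only, not nbgrader's actual collect code; `iter_student_submissions` is a hypothetical name.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_student_submissions(inbound_root: str) -> Iterator[Tuple[str, Path]]:
    """Yield (student_id, submission_path) for every per-student inbound directory.

    Assumes the layout <inbound_root>/<student_id>/<submission>.
    Stray files directly under inbound_root are ignored.
    """
    root = Path(inbound_root)
    student_dirs = sorted(p for p in root.iterdir() if p.is_dir())
    for student_dir in student_dirs:
        for submission in sorted(student_dir.iterdir()):
            yield student_dir.name, submission
```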
Future work is to either adopt hubshare [https://github.com/jupyterhub/hubshare], or write our own version.
We would also push for a central `nbgrader.db`, and add the functionality for multiple courses & the like to it... and an API to allow instructors/students to work on any course they've previously been associated with, and for the `inbound`/`outbound` stuff to all be interacted with through an API (which then does authorization against this central database).
Also related: #1010
We are currently using the basic system, and we intend to deploy with either Docker or Kubernetes in the future.
To allow users to be enrolled in more than one course at a given time, with a user-friendly UI that allows them to browse their courses and assignments in a simple way (the current solution in the docs allows students to have access to all courses at a given time, which in a real scenario would mean 100+ courses).
To allow graders to be graders in more than one course at the same time (having full access to release, grade and give feedback on assignments in all their courses), without having to change any configuration files.
To allow graders to be students in other courses than the ones where they are graders, without losing functionality and without gaining access to information they shouldn't have just because they have the rank of grader in another course.
Based on the solution proposed in this issue, we intend to create an external service (we are considering using Flask for this) that allows users to query for the courses they are in as students, the courses they are in as graders, and the students they have in a given course if they are graders in said course. This separation will make it easier to work with other systems if we plan to integrate more systems with JHub/Nbgrader.
This would also include changes to the UI, but its implementation is dependent on whether we can do the backend changes - or not.
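To make the three queries concrete, here is a framework-free sketch of the lookup logic such a service would wrap. The `Roster` class and its method names are hypothetical; a real service would sit behind HTTP endpoints and a database rather than an in-memory dict.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Roster:
    """In-memory course membership: course_id -> {user_id: role}."""
    memberships: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def enroll(self, course_id: str, user_id: str, role: str) -> None:
        if role not in ("student", "grader"):
            raise ValueError(f"unknown role: {role!r}")
        self.memberships.setdefault(course_id, {})[user_id] = role

    def courses_as(self, user_id: str, role: str) -> List[str]:
        """Courses in which user_id holds exactly this role."""
        return sorted(
            course_id
            for course_id, members in self.memberships.items()
            if members.get(user_id) == role
        )

    def students_in(self, course_id: str, requester_id: str) -> List[str]:
        """Students of a course, visible only to graders of that same course."""
        members = self.memberships.get(course_id, {})
        if members.get(requester_id) != "grader":
            raise PermissionError(f"{requester_id} is not a grader in {course_id}")
        return sorted(u for u, r in members.items() if r == "student")
```

Because the role is stored per course, a grader in one course is an ordinary student in another and cannot list that course's students.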
We'd like to ask for guidance on which files to look at to start making changes. Personally, I've been toying around trying to change how Nbgrader manages the `course_id`, but with so many uses of the variable in so many different files, it is not easy to just go and edit the code. Any insight would be appreciated :)
Rather than an external service, you can build on JupyterHub's services. See for example these two services:
https://github.com/BrynMawrCollege/jupyterhub
You can make sure that they are logged in, and then perform the various functions (retrieving class rosters for the teacher, etc.). Sounds like a great idea!
@dsblank Thanks for the feedback! We are planning to work on this ASAP, although I haven't worked with services before. Is there any guide on how to make good services? I'm thinking that for our solution (I'm working with @FranLucchini on this), we'll need to be able to pass parameters (current user id, course id, student ids in case of a grader) to the service and receive responses from it dynamically, and those responses will affect what the UI shows.
The main docs are here: https://jupyterhub.readthedocs.io/en/stable/reference/services.html
It is just a regular web endpoint, so you should be able to do whatever you want. You will be able to check to see if the user is logged in, and get their user_id, for example.
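As a rough illustration of the service talking back to the Hub: a Hub-managed service is started with `JUPYTERHUB_API_URL` and `JUPYTERHUB_API_TOKEN` in its environment, and can use them to call the Hub REST API. The helper below only builds the request (no network call is made); `hub_api_request` is a hypothetical name.

```python
import os
import urllib.request


def hub_api_request(path: str) -> urllib.request.Request:
    """Build an authenticated request against the JupyterHub REST API.

    Reads the API URL and token from the environment that JupyterHub
    sets for managed services.
    """
    api_url = os.environ["JUPYTERHUB_API_URL"].rstrip("/")
    token = os.environ["JUPYTERHUB_API_TOKEN"]
    return urllib.request.Request(
        f"{api_url}/{path.lstrip('/')}",
        headers={"Authorization": f"token {token}"},
    )
```

For example, `urllib.request.urlopen(hub_api_request("users/alice"))` would fetch the Hub's user model for `alice`, assuming the service token is allowed to read it.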
@perllaghu I think integration with LTI is very necessary, for there are already many ID providers, especially in Learning Management Systems.
@fishfree - remember that NBGrader itself has no need to know about LTI... however, it would be good if it could work in a JupyterHub environment driven by LTI connections.
I was thinking about this over the weekend... and I think there are three distinct, but inter-related, parts that need to be resolved:
- `formgrader` and `assignments` functions need to be able to cope with multiple courses (and, consequently, multiple roles)
- `exchange` functionality and move towards an API-based system (such as `hubshare`)
- `LTI` as a potential access method for a Notebooks/JupyterHub service

Taking the `LTI` issue first: this should not be something for NBGrader to concern itself with, but for some `_prestart` hook to manage. Its impact on NBGrader is that users, courses, and roles (students vs. instructors) are not known at the time the JupyterHub service is created, so they need to be managed in the pre-start hook area.
The first issue, of multiple courses, seems to have a solution of extending the Jupyterhub User model so one can use that to get a list of courses for each user.
This leaves us with modifying the `exchange`, and the use of `hubshare` (https://github.com/jupyterhub/hubshare) - what is the state of this work?
What other issues are there that I've missed?
- `jupyter/scipy-notebook`: `nbgitpuller` and `nbgrader`
```sh
NB_HOME="/home/$NB_USER"
/usr/sbin/usermod -u $NB_UID -d "$NB_HOME" -l "$NB_USER" jovyan
sed -r "s#Defaults\s+secure_path=\"([^\"]+)\"#Defaults secure_path=\"\1:$CONDA_DIR/bin\"#" /etc/sudoers | grep secure_path > /etc/sudoers.d/path
mkdir -p "$NB_HOME"
chown -R "$NB_USER:" "$NB_HOME"
cd "$NB_HOME"
exec sudo -E -H -u "$NB_USER" PATH="$PATH" XDG_CACHE_HOME="/home/$NB_USER/.cache" PYTHONPATH="${PYTHONPATH:-}" $cmd
```
- Special Helm chart values to make it work:
```yaml
...
hub:
  extraConfig:
    environment: |-
      import string
      safe_chars = set(string.ascii_lowercase + string.digits)
      c.KubeSpawner.environment = {}
      c.KubeSpawner.environment["NB_UID"] = lambda self: str(1000 + int(self.user.id))
      c.KubeSpawner.environment["NB_USER"] = lambda self: ''.join([s if s in safe_chars else '-' for s in self.user.name.lower()])
    priv: "c.KubeSpawner.privileged = True"
    uid: "c.KubeSpawner.uid = 0"
...
singleuser:
  cmd: "start-singleuser.sh"
  image:
    name: ".../singleuser"
    tag: "..."
  storage:
    dynamic:
      storageClass: "nfs-client"
    extraVolumeMounts:
      -
        mountPath: "/srv/nbgrader/exchange"
        name: "exchange"
    extraVolumes:
      -
        name: "exchange"
        persistentVolumeClaim:
          claimName: "exchange"
    homeMountPath: "/home"
```
@moschlar - you have an environment very similar to ours... except we have multiple "single-user" notebooks, and an external service that identifies the user & which notebook they're wanting to use.
How are you managing persistent storage for your users?
@perllaghu

> How are you managing persistent storage for your users?
We have central NFS storage (NetApp appliance) which exports a volume (with sec=sys) for the Kubernetes cluster. And then we use nfs-client-provisioner to provision directories for each PersistentVolumeClaim. Seems to work so far!
@moschlar Very similar to us - and we're at >14,000 users, and I've seen >150 simultaneously
We've squashed the NFS permissions, so all files on disk are owned by `jovyan`, and we mount just the user's space into the notebook Docker image (seemed easier than mucking about with UIDs in notebooks, and the security issue is fairly low).
Our problem is the NFS mounting for the exchange [for multiple courses] - it's just not proving feasible, hence the comment above about needing an exchange service to manage that.
Today was the first lecture and we had ~160 simultaneous users (should be ~300, if the wifi would work :grimacing:).
We currently only have one course so that problem didn't come up yet...
But while reading through the docs and source code, I made a mental note that we could maybe solve it using the Spawner options form, allowing the students to select a course while spawning their Notebook server. Did you consider that?
That's how we did it... and when I have time, I'll find the bits of code we have :) (though note that our code is in an external front-end, which tells the hub what to spawn, and for whom)
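For anyone wanting to try the options-form route inside JupyterHub itself, a sketch might look like this. The course table and function names are made up for illustration; the two functions are written as plain helpers that could be adapted to the Spawner's `options_form` and `options_from_form` settings, whose exact callable signatures should be checked against the JupyterHub version in use.

```python
from html import escape
from typing import Dict, List

# Hypothetical course table: course_id -> display title
COURSES = {"cs101": "Intro to Programming", "stat200": "Statistics"}


def render_options_form(courses: Dict[str, str]) -> str:
    """Return the HTML for a course selection drop-down."""
    options = "\n".join(
        f'<option value="{escape(cid)}">{escape(title)}</option>'
        for cid, title in sorted(courses.items())
    )
    return (
        '<label for="course">Course</label>\n'
        f'<select name="course">\n{options}\n</select>'
    )


def parse_options(formdata: Dict[str, List[str]], courses: Dict[str, str]) -> dict:
    """Turn submitted form data (field -> list of strings) into spawn options."""
    course_id = formdata.get("course", [""])[0]
    if course_id not in courses:
        raise ValueError(f"unknown course: {course_id!r}")
    return {"course_id": course_id}
```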
@moschlar - this is a cut-down version of the options dict we pass to the `kubespawner` in JupyterHub.

```python
import json

import requests
from django.conf import settings  # assumed: settings is the Django settings object

# Users are Django users; Container is a table of available notebooks
# (the project's own model, import not shown in this cut-down version).
# self is a user object; pass in the Container table primary key.
def jupyter_start(self, container_pk):
    containers = Container.objects.all()
    container = containers.get(pk=container_pk)
    # Define the volume mount from the NFS mount
    mounts = [{
        'name': 'home',
        'hostPath': '/disk/remote{}'.format(self.user_directory),
        'mountPath': '/home/jovyan'
    }]
    # Set up the options dictionary
    user_options = {
        # list of constraints
        'image': container.service_name,  # name of the image
        'labels': {
            'username': self.username,
            'container': container.display_name.lower().replace(' ', '_'),
            'type': 'naas'
        },
        'volumes': mounts
    }
    # Ask the hub to start this user's server with the options above
    return requests.post(
        'http://{}/hub/api/users/{}/server'.format(settings.JUPYTER_URL, self.username),
        headers={'Authorization': 'token ' + self.jupyter_token},
        data=json.dumps(user_options), verify=False)


def jupyter_shutdown(self):
    # Ask the hub to stop this user's server
    return requests.delete(
        'http://{}/hub/api/users/{}/server'.format(settings.JUPYTER_URL, self.username),
        headers={'Authorization': 'token ' + self.jupyter_token}, verify=False)
```
Hi. Does nbgrader support multiple courses now? I am interested in integrating nbgrader with JupyterHub such that a single installation of JupyterHub can support multiple courses and multiple instructors.
@alphadevuser One way of having multiple instructors is just by allowing them access to the service where you are running the course from. As for multiple courses for each student I am already working on #1040 . Let me know if you are going to work on it, I have not taken the time to refactor it and test in some time, fresh eyes are always welcome.
Hi @sigurdurb ! Thanks for the response. I am working on a personal project, and unfortunately won't be able to dedicate time on this codebase. I really appreciate the effort you are putting in. Hopefully, I would be able to contribute soon.
Question on #1040: this doesn't seem to support creating courses on the fly. I mean, you can only add a new course when starting the JupyterHub server; once the server is up, one can't create courses. Am I right?
There's a workshop coming up in Edinburgh soon.... and I'm hoping we can address this.
It ties in with using LTI connections, with department-/campus-wide jupyterhub installations, and with a couple of other issues
I'm late, but here's what we have...

- `jupyterhub_config.py`
- Multiple courses
  - `KubeSpawner.profile_list` a callable.
  - `pre_spawn_hook` to mount the right course data and select the right container image.
- Multiple users
  - `pre_spawn_hook`
- Multiple instructors
  - `pre_spawn_hook`.
  - `pre_spawn_hook`).

All of this already works and many courses have used it. Some improvements could make things smoother, but really most work we do is local integration. If formgrader would support multiple courses, it could be designed so that you don't have to select the course at spawn-time... though this may still be needed to select the right container image.
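To give a flavour of the two hooks mentioned above, here is a heavily simplified sketch rather than our actual config. The course table, image names, and `/courses/<course_id>` host path are invented for illustration, and it assumes KubeSpawner stores the selected profile's slug under `user_options["profile"]`.

```python
# Hypothetical course table: course_id -> image and enrolled users
COURSES = {
    "cs101": {"image": "example/cs101-notebook:latest", "users": {"alice", "bob"}},
    "ma201": {"image": "example/ma201-notebook:latest", "users": {"bob"}},
}


def course_profiles(spawner):
    """Callable for profile_list: offer one profile per course the user is in."""
    username = spawner.user.name
    return [
        {
            "display_name": course_id,
            "slug": course_id,
            "kubespawner_override": {"image": info["image"]},
        }
        for course_id, info in sorted(COURSES.items())
        if username in info["users"]
    ]


def mount_course_data(spawner):
    """pre_spawn_hook: mount the data of the course selected at spawn time."""
    course_id = spawner.user_options.get("profile")
    if course_id not in COURSES:
        return  # no course selected: plain server, nothing extra mounted
    if spawner.user.name not in COURSES[course_id]["users"]:
        raise PermissionError(f"{spawner.user.name} is not in {course_id}")
    spawner.volumes.append(
        {"name": "course", "hostPath": {"path": f"/courses/{course_id}"}}
    )
    spawner.volume_mounts.append({"name": "course", "mountPath": "/course"})


# In jupyterhub_config.py these would be wired up roughly as:
#   c.KubeSpawner.profile_list = course_profiles
#   c.Spawner.pre_spawn_hook = mount_course_data
```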
Student and instructor UI (visible options depend on who is logged in):
Hi all, I wanted to give a quick update on this issue in light of what we've been working on at the Edinburgh hackathon! We made the decision to make nbgrader more compatible with multiple courses, with the following changes:
- `nbgrader list` and the Assignment List based on which courses a student is actually a part of (#893, #1040). This is configurable and by default there are no access controls, but nbgrader includes access controls for JupyterHub and you can write your own custom plugin too if you want (#1093).
- `course_id` as an argument (#1118).

Note that even with all these changes, we are still making the following assumptions:
- `nbgrader_config.py` files.

I plan to address these at some point, which will make the setup for multiple classes even smoother, but probably not immediately. But I hope that the changes we've made so far will make having multiple classes a much nicer experience than it is currently!
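As a rough picture of the course-based filtering described above (not nbgrader's actual implementation), the idea is that a listing is narrowed to the student's courses only when an access-control source is configured. `visible_assignments` and the dict shape are hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def visible_assignments(
    assignments: Iterable[Dict[str, str]],
    student_courses: Optional[Iterable[str]],
) -> List[Dict[str, str]]:
    """Filter released assignments down to the courses a student belongs to.

    student_courses=None models "no access controls": everything is shown.
    """
    assignments = list(assignments)
    if student_courses is None:
        return assignments
    allowed = set(student_courses)
    return [a for a in assignments if a["course_id"] in allowed]
```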
Huge thanks to @sigurdurb @BertR and @damianavila who worked on this at the hackathon!! 🥇
This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:
https://discourse.jupyter.org/t/jupyter-nbgrader-hackathon-topics/1040/1
This is a discussion-point, to try and bring together the various threads
We have a number of similar threads on multiple courses, multiple instructors, and the possibility of an external exchange service.
We all want to use NBGrader, as it's an excellent tool for creating tests & grading the responses.
It would be good to get a common solution, or at least several non-exclusive solutions, for these issues.
Please could people describe what they are trying to achieve, and the environment in which they are running nbgrader?