berkeley-dsep-infra / datahub

JupyterHubs for use by Berkeley enrolled students
https://docs.datahub.berkeley.edu
BSD 3-Clause "New" or "Revised" License

Data100 Autograding VM #1219

Status: Closed (wwhuang closed this issue 4 years ago)

wwhuang commented 4 years ago

The autograder runs a simple Django server and is the autograding endpoint for our Okpy instance. The server is set to listen on port 8000.

Autograding process: Okpy sends a POST request to our server telling it the assignment names that need to be graded. An assignment name corresponds to an individual student's submission for a particular homework or project.

Our server spins up one Docker pod per assignment. Each pod is responsible for grading one assignment. The pods run on the data100-fall-2019 cluster, in the data100-staging namespace. We set the parallelism to 200 to prevent the autograder from spinning up too many pods at once.
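For illustration, a cap like the parallelism setting above could be sketched with a bounded worker pool. This is only a sketch, not the autograder's actual code: `grade_all` and `launch_pod` are hypothetical names, and `launch_pod` is assumed to start one grading pod and block until that pod finishes.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL_PODS = 200  # the parallelism value mentioned in this comment


def grade_all(assignment_names, launch_pod, max_parallel=MAX_PARALLEL_PODS):
    """Call launch_pod(name) for every assignment, at most max_parallel at a time.

    launch_pod is a hypothetical callable that starts one grading pod and
    blocks until it finishes; it is not part of the real autograder.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # The pool never runs more than max_parallel calls concurrently;
        # map() returns results in the same order as the input names.
        return list(pool.map(launch_pod, assignment_names))
```

With this shape, raising or lowering the cap is a one-argument change, which would make it easy to experiment with different limits.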

The pod submits a GET request to the autograder to find out the name of the assignment it should grade, then submits a GET request to Okpy to fetch the actual assignment.

Once the pod is done grading the assignment, it submits a POST request back to the autograder, which saves the result in a local Postgres database and submits a POST request to Okpy with the grade.
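The request flow described above can be sketched roughly as follows. Everything here is an assumption for illustration: the URL paths, the JSON field names, and the function names are hypothetical, the HTTP calls are passed in as plain callables, and a dict stands in for the Postgres database.

```python
def run_grading_pod(http_get, http_post, grade, autograder_url, okpy_url):
    """Pod side: find out what to grade, fetch it, grade it, report back."""
    # 1. GET the assignment name from the autograder (hypothetical path).
    name = http_get(f"{autograder_url}/assignment")["name"]
    # 2. GET the actual submission from Okpy (hypothetical path).
    submission = http_get(f"{okpy_url}/submissions/{name}")
    # 3. Grade it and POST the result back to the autograder.
    result = {"name": name, "score": grade(submission)}
    http_post(f"{autograder_url}/results", result)
    return result


def handle_result(result, db, http_post, okpy_url):
    """Autograder side: persist the result, then forward the grade to Okpy."""
    db[result["name"]] = result["score"]  # stand-in for the Postgres write
    http_post(f"{okpy_url}/grades", result)  # hypothetical path
```

Passing the HTTP functions in as arguments keeps the sketch independent of any particular client library and makes the two halves easy to exercise with fakes.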

felder commented 4 years ago

I set up the grader VM to run under a service account created just for this. This was fairly straightforward, since I deployed the same Linux distro as was running before and then simply rsync'd all the data (home dir and PostgreSQL data) from Azure. I then granted the permissions necessary so that the service account can get credentials, list pods and namespaces, and create, remove, and grab logs from pods.

Additionally, the Django grading server has been placed behind an Apache proxy with SSL enabled.

Tests of the system were successful today, so as far as we are aware the grader is ready for action. Will did have a question about how many pods he could start at once, saying it needs to be more than 1200. He's going to do some tests to see how that pans out.

One last detail that I'd like to check in with @yuvipanda about is how to limit this service account to manipulating only grader-server pods in the data100-staging namespace. Right now the perms are somewhat limited, but it'd be good to limit them further.
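One possible direction, sketched here as Python dicts that mirror Kubernetes RBAC manifests, is a namespaced Role plus a RoleBinding. This is an assumption about how the limit might be expressed, not what was deployed: the role name is hypothetical, the verb list is a guess at the minimum needed, and the binding subject is left to the caller because it depends on how the service account authenticates to the cluster.

```python
RBAC_API = "rbac.authorization.k8s.io"


def grader_role(namespace="data100-staging", name="grader-pod-manager"):
    """Role allowing pod management and log reads in one namespace only.

    The role name and verb list are illustrative assumptions.
    """
    return {
        "apiVersion": f"{RBAC_API}/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            # Core API group (""): create, inspect, and remove pods.
            {"apiGroups": [""], "resources": ["pods"],
             "verbs": ["get", "list", "create", "delete"]},
            # Reading logs is a separate subresource.
            {"apiGroups": [""], "resources": ["pods/log"], "verbs": ["get"]},
        ],
    }


def grader_role_binding(subject, namespace="data100-staging",
                        role_name="grader-pod-manager"):
    """Bind the Role to a caller-supplied subject (kind/name are assumptions)."""
    return {
        "apiVersion": f"{RBAC_API}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{role_name}-binding", "namespace": namespace},
        "subjects": [subject],
        "roleRef": {"apiGroup": RBAC_API, "kind": "Role", "name": role_name},
    }
```

A Role like this scopes access to the namespace, but RBAC rules match on resource type (and optionally resource name), not on labels, so restricting the account to grader-server pods specifically would likely need something beyond a plain Role. Listing namespaces is cluster-scoped and would need a separate ClusterRole.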

yuvipanda commented 4 years ago

Thanks @wwhuang and @felder for making this happen!

In the future, I would like this to run inside the Kubernetes cluster: a Pod with a Django container, a Postgres container, and a persistent volume for whatever data is needed. It can be set up as a JupyterHub service to expose it to the internet and provide HTTPS. This makes managing permissions much easier and removes the hand-maintained VM.
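As a rough sketch of that layout, the Pod could look something like the manifest below, written as a Python dict. The pod name, the Django image, the claim name, and the namespace are all placeholders supplied by the caller, and the Postgres mount path is an assumption about the image's data directory that should be checked against the image actually used.

```python
def autograder_pod(django_image, claim_name, namespace):
    """Pod with a Django container, a Postgres container, and a persistent volume.

    All names and the mount path are illustrative assumptions.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "autograder", "namespace": namespace},
        "spec": {
            "containers": [
                {
                    "name": "django",
                    "image": django_image,
                    # Port 8000 is where the grading server listens today.
                    "ports": [{"containerPort": 8000}],
                },
                {
                    "name": "postgres",
                    "image": "postgres",
                    # Assumed data directory; verify for the chosen image tag.
                    "volumeMounts": [
                        {"name": "data", "mountPath": "/var/lib/postgresql/data"}
                    ],
                },
            ],
            "volumes": [
                {"name": "data",
                 "persistentVolumeClaim": {"claimName": claim_name}}
            ],
        },
    }
```

Because both containers share the Pod's network namespace, Django could reach Postgres on localhost, which keeps the database off the cluster network.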