jupyter / nbgrader

A system for assigning and grading notebooks
https://nbgrader.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Merging gradebooks #1424

Open nthiery opened 3 years ago

nthiery commented 3 years ago

Assume that autograding took place at two different locations, one on a subset A of students and the other on a subset B of students. What would be the simplest way to take the two resulting gradebooks and merge them into a single gradebook holding the data for both A and B? (Exporting the grades and merging the CSV files is not sufficient, since I actually need the gradebook for the subsequent manual grading step.)

Context: I am experimenting with a workflow where GitLab acts as the exchange service (see #1345). In particular, autograding is taken care of by continuous integration on the individual student submissions. Hence I get one gradebook per student, and I want to merge them together on the instructor's machine before proceeding to the manual grading step.

Thanks!

jhamrick commented 3 years ago

If you can assume that (1) all assignments are identical and (2) the subsets of students are completely independent, that simplifies things a lot (as then you don't need to do any checking to make sure there aren't inconsistencies). In that case, you should be able to loop over students in gradebook B, add them to gradebook A, create a new submission for them for the assignment in gradebook A, and then copy the grades from gradebook B. Something like this:

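from nbgrader.api import Gradebook

# Gradebook takes a database URL rather than a bare file path
gb1 = Gradebook('sqlite:///gradebook1.db')
gb2 = Gradebook('sqlite:///gradebook2.db')
for student in gb2.students:
    # register the student and an empty submission in the target gradebook
    gb1.add_student(**student.to_dict())
    gb1.add_submission('ps1', student.id)
    old_sub = gb2.find_submission(...)
    for old_grade in old_sub.grades:
        # look up the matching grade in the target and copy the fields over
        new_grade = gb1.find_grade(...)
        new_grade.auto_score = old_grade.auto_score
        ...
    for old_comment in old_sub.comments:
        # similarly as for grades
        ...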

If I have time I'll try to actually flesh out a script to do this (it would make for a useful nbgrader db merge command or something!) but for now hopefully this could be enough to get started.
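Assumption (2) above can be checked cheaply before merging. The following is a minimal sketch, not part of nbgrader: `overlapping_ids` is a hypothetical helper that takes two collections of student ids and reports any that appear in both.

```python
from typing import Iterable, List


def overlapping_ids(ids_a: Iterable[str], ids_b: Iterable[str]) -> List[str]:
    """Return the student ids present in both collections, sorted.

    Hypothetical helper for illustration; an empty result means the two
    subsets of students are independent.
    """
    return sorted(set(ids_a) & set(ids_b))


# Toy data standing in for the ids found in two gradebooks
overlap = overlapping_ids(["alice", "bob"], ["carol", "bob"])
```

With real gradebooks, the ids could presumably be collected with something like `(s.id for s in gb1.students)`; if the result is non-empty, the simple copy loop is no longer safe without extra consistency checks.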

nthiery commented 3 years ago

Thank you so much @jhamrick!

I had just started digging into this and was slowly making my way, but this template will save me a lot of time (as will knowing that it's safe to try something like this!).

Ok, time to go to bed. When I have something working, I'll post it here for sharing and review (hopefully in the coming days).

nthiery commented 3 years ago

For the record, here is my first approximation:

def merge_submission_gradebook(source: Gradebook, target: Gradebook) -> None:
    """
    Merge the students and submissions from the source gradebook into the target gradebook

    Assumptions:
    - the target gradebook already contains the assignments
    """
    for student in source.students:
        args, kwargs = to_args(student, ['id'])
        target.update_or_create_student(*args, **kwargs)

    for assignment in source.assignments:
        for submission in source.assignment_submissions(assignment.name):
            args, kwargs = to_args(submission, ['student'])
            target.update_or_create_submission(assignment.name, *args, **kwargs)

            for notebook in submission.notebooks:
                for grade in notebook.grades:
                    args, kwargs = to_args(grade, ['name', 'notebook', 'assignment', 'student'])
                    target_grade = target.find_grade(*args)
                    for key, value in kwargs.items():
                        setattr(target_grade, key, value)

                for comment in notebook.comments:
                    args, kwargs = to_args(comment, ['name', 'notebook', 'assignment', 'student'])
                    target_comment = target.find_comment(*args)
                    for key, value in kwargs.items():
                        setattr(target_comment, key, value)
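
The function above relies on a `to_args` helper that is not shown in this thread. The following is only a guess at its shape, assuming it splits the result of `to_dict()` into positional values (in the order requested) and the remaining keyword arguments. The `exclude` parameter is an extra assumption, for keys that the target method would not accept.

```python
from typing import Any, Dict, Iterable, List, Tuple


def to_args(obj: Any, positional: List[str],
            exclude: Iterable[str] = ()) -> Tuple[List[Any], Dict[str, Any]]:
    """Hypothetical reconstruction of the helper used in the merge functions.

    Splits obj.to_dict() into a list of positional values, taken in the
    order given by `positional`, and a dict holding the remaining entries.
    """
    data = dict(obj.to_dict())
    for key in exclude:
        data.pop(key, None)  # silently drop keys the caller does not want
    args = [data.pop(key) for key in positional]
    return args, data


class FakeStudent:
    """Stand-in object exposing to_dict(), for illustration only."""

    def to_dict(self) -> Dict[str, Any]:
        return {"id": "s1", "first_name": "Ada", "max_score": 10.0}


args, kwargs = to_args(FakeStudent(), ["id"], exclude=["max_score"])
```

Which keys actually need to be excluded depends on what each `update_or_create_*` method accepts, which is not spelled out here.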

nthiery commented 3 years ago

As an aside: it would be convenient if the update_or_create_XXX methods accepted the result of to_dict as is!

nthiery commented 3 years ago

And here is the complementary part, to merge in the assignment information.

def merge_assignment_gradebook(source: Gradebook, target: Gradebook) -> None:
    """
    Merge the gradebook's assignment information into the target gradebook

    Grades are ignored
    """
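    # the source gradebook is expected to hold exactly one assignment;
    # the unpacking below raises ValueError otherwise
    assignment, = source.assignments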
    target.update_or_create_assignment(assignment.name)

    for notebook in assignment.notebooks:
        args, kwargs = to_args(notebook, ['name'])
        target.update_or_create_notebook(*args, assignment.name, **kwargs)
        for cell in notebook.grade_cells:
            args, kwargs = to_args(cell, ['name', 'notebook', 'assignment'])
            target.update_or_create_grade_cell(*args, **kwargs)
        for cell in notebook.task_cells:
            args, kwargs = to_args(cell, ['name', 'notebook', 'assignment'])
            target.update_or_create_task_cell(*args, cell_type=cell.cell_type, **kwargs)
        for cell in notebook.source_cells:
            args, kwargs = to_args(cell, ['name', 'notebook', 'assignment'])
            target.update_or_create_source_cell(*args, **kwargs)
        for cell in notebook.solution_cells:
            args, kwargs = to_args(cell, ['name', 'notebook', 'assignment'])
            target.update_or_create_solution_cell(*args, **kwargs)

nthiery commented 3 years ago

For the record, this is pretty slow: it takes a bit less than a second per submission. That's OK for my application, but not very reasonable in principle.

nthiery commented 3 years ago

Caveat: Gradebook.close does not trigger a database commit! Hence the last transaction may be lost. It took me a while to figure out why the grades of my last student were not merged properly. It's fixed by forcing a commit before closing the database:

 target.db.commit()
 target.close()

Maybe merge_gradebook should always call target.db.commit() to be on the safe side? Or maybe Gradebook.close should always call a commit?