
Grading in Dodona #2401

Closed · niknetniko closed this issue 3 years ago

niknetniko commented 3 years ago

Thoughts/proposal for a first version:

Terminology

Flow

The intended flow of grading would be like this:

  1. The evaluator creates an evaluation for a series, as is done now.
  2. After selecting the relevant exercises and users, there is a third step in which the evaluator must specify the ScoreTemplates per exercise. This can be very simple (e.g. x/10 for each exercise), but a more detailed tree is also possible. Automating this is difficult, because an exercise has no notion of the result the judge will produce. We could, however, allow a grading suggestion in the exercise config (e.g. if you author an exercise with two functions, you could suggest splitting the grading into two grade items).
  3. When actually doing the evaluation, the evaluator can only mark a submission as evaluated once each leaf in the ScoreTemplate tree is filled in (see the sketch after this list). We'll need to think about where to put this: above or below the submission code, or in a new tab (next to the "code" tab).
  4. When the evaluation is released, the students will be able to see their grades. (Again, the total for a submission could be shown above the code, while the ScoreTemplates are displayed in a new tab.)
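A minimal sketch of how the ScoreTemplate tree and the "every leaf is filled in" check from step 3 could be modelled. All class and field names here are hypothetical illustrations, not Dodona's actual schema:

```python
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ScoreTemplate:
    """One node in a per-exercise grading tree (hypothetical model)."""
    name: str                      # e.g. "Exercise 1" or "function parse()"
    maximum: float                 # maximum score for this node
    children: list[ScoreTemplate] = field(default_factory=list)
    score: Optional[float] = None  # filled in on leaves by the evaluator

    def is_leaf(self) -> bool:
        return not self.children

    def is_complete(self) -> bool:
        """Step 3: a submission may only be marked 'evaluated'
        once every leaf carries a score."""
        if self.is_leaf():
            return self.score is not None
        return all(child.is_complete() for child in self.children)

    def total(self) -> float:
        """Leaf scores roll up to the root automatically."""
        if self.is_leaf():
            return self.score or 0.0
        return sum(child.total() for child in self.children)
```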

Future

Things that are not for this issue, but that can be done later if the basic version is deemed OK:

LTI integration

In LTI, there are some different concepts:

Some problems need investigating:

Penalties

Another useful addition would be the concept of penalties: annotations with an assigned score that is deducted from the total score for an exercise.

The main difference with the scenario where the evaluator fills in the scores is that here, each student starts with the maximum available points, and the annotations are subtracted from that score (e.g. 10/10 for a function, minus a -1 penalty for using break/continue, gives a score of 9/10).

This will need some thought on how both modes can be integrated, and on how penalties behave when the evaluator wishes to override the final score. One possibility is allowing a custom start score (e.g. the student starts at 9/10, from which penalties are deducted). This might feel arbitrary to students, though (e.g. why do I have 7/10 when I only have two penalties of -1 each?). Another option is to disallow mixing both modes: either the student starts at 0 and the evaluator fills in the score, or the student starts at the max score and the evaluator must use penalties to deduct points. A sketch of that option follows.
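A small sketch of how the two grading modes could be kept strictly separate. GradingMode and final_score are made-up names for illustration, not part of Dodona:

```python
from enum import Enum
from typing import Sequence

class GradingMode(Enum):
    ADDITIVE = "additive"        # start at 0; the evaluator fills in scores
    SUBTRACTIVE = "subtractive"  # start at max; penalties are deducted

def final_score(mode: GradingMode, maximum: float,
                filled_in: float = 0.0,
                penalties: Sequence[float] = ()) -> float:
    """Hypothetical helper computing one exercise's score for one student."""
    if mode is GradingMode.ADDITIVE:
        if penalties:
            raise ValueError("penalties cannot be mixed with additive grading")
        return filled_in
    # Subtractive mode: 10/10 minus a -1 "break/continue" penalty gives 9/10.
    return max(0.0, maximum - sum(penalties))

print(final_score(GradingMode.SUBTRACTIVE, maximum=10, penalties=[1]))  # 9.0
```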

pdawyndt commented 3 years ago

First reflections after reading the proposal

niknetniko commented 3 years ago

> why not ScoreItem instead of GradeItem (or Grade instead of Score) to make it more consistent?

This was because with "ScoreItem" and "Score" it isn't obvious to me which is which, but I agree that "GradeItem" doesn't solve this. Perhaps a better name would be "ScoreBlueprint" or "ScoreDistribution" (from the Dutch "puntenverdeling", i.e. distribution of points).

> do we need a tree to start with? Just some ScoreItems and an (automatically computed) total score might be sufficient for a first iteration.

We don't really need the tree, but:

(Of course, this tree structure is internal to Dodona; we don't have to expose it. The UI can simply show some ScoreItems per exercise, with a computed total, as in the sketch below.)
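To make that concrete: the flat view is just a depth-1 instance of the hypothetical ScoreTemplate tree sketched earlier, with the total computed from the leaves. The item names below are invented for illustration:

```python
# A flat list of ScoreItems is a depth-1 tree (hypothetical names),
# reusing the ScoreTemplate class from the earlier sketch:
exercise = ScoreTemplate("Exercise 1", maximum=10, children=[
    ScoreTemplate("correctness", maximum=6),
    ScoreTemplate("code style", maximum=4),
])

exercise.children[0].score = 5.0
exercise.children[1].score = 3.5
print(exercise.total())        # 8.5, computed automatically
print(exercise.is_complete())  # True, since every leaf is filled in
```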

> I guess the scoring system will initially be fixed after creating an evaluation? Afterwards we might consider what the options are to make changes: what should be done with scores that are already assigned?

A "simple" solution would be to unmark all completed submissions as done, meaning they would need to be evaluated again, which is probably OK, since I assume if you change the scoring system, you'll need to update the scores

pdawyndt commented 3 years ago

The latter might result in a lot of work being lost. We might start with a fixed scoring scheme and later allow some changes. There are obvious changes that don't really impact the scores already given:

Rather than "throwing away" scores given before changing the score system, we might keep the scores given, but undo the "completed" status of their review so they need to be reviewed again (with the old scores still in place). Here's some actions that might be considered:

pdawyndt commented 3 years ago

@niknetniko these might be interesting to read through for inspiration (models and terminology):