Implement Public, Final, and Token Tests.

Next to the API Client, the biggest blocker that I've observed is that students get pretty frustrated with Galah because they rely on it to test their code for them. This is not what Galah was intended for, and it will not teach students good practices.

The solution to this has been known for a while but has never been implemented. I think it's very important that we implement it now, before the problem becomes too dire.

We will follow the Marmoset Project's example and implement three kinds of tests.

Public tests will work the way tests do right now: they are run every time the student submits, and the results are shown to the student.

Final tests will be tests whose results are only shown to the student once the due date passes (this can be implemented by having sisyphus check every minute or so whether any assignments need this done). The final tests should still be run every time the user submits; the results just won't be displayed to the user.

Token tests will be tests whose results students can spend tokens to see. We'll want to mimic Marmoset here, because Professor Pugh has clearly thought hard about the best way to do this and it sounds perfect. We'll want to attribute his work somehow as well (maybe a special thanks in the CONTRIBUTORS file?).
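To make the three kinds concrete, here is a minimal sketch of the visibility rule. Everything in it is hypothetical (the `TestKind` enum, the `results_visible` function, and its parameters are not existing Galah code), and it assumes that token test results, like final test results, become visible to everyone once the due date passes.

```python
from datetime import datetime
from enum import Enum


class TestKind(Enum):
    """Hypothetical names for the three kinds of tests proposed here."""
    PUBLIC = "public"
    FINAL = "final"
    TOKEN = "token"


def results_visible(kind, now, due_date, token_spent=False):
    """Return True if the student may see results for a test of this kind.

    token_spent says whether the student spent a token on this submission.
    """
    if kind is TestKind.PUBLIC:
        return True  # public results are always shown
    if now >= due_date:
        return True  # assumption: everything is revealed after the due date
    if kind is TestKind.TOKEN:
        return token_spent  # before the due date, a token is required
    return False  # final tests stay hidden until the due date
```

Note that this only decides what is displayed. All three kinds of tests would still be run on every submission.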

If the submission passes all of the public test cases, the student is given an option to perform a release test of the submission. Perhaps this is the poker game project, and the student performs a release test. They might be told:

There are 12 release tests. This submission passed 7 release tests, and failed 5. The names of the first two failed tests are "full_house" and "4_of_a_kind" (the names of only the first two failed release tests are revealed).

Now, a student can think "oh, I think I know what I did wrong," change their code and resubmit. But performing a release test requires using a release token. Students are given some number of tokens (typically 2 or 3) and they regenerate 24 hours after being used. This has many repercussions.

Students have an incentive to start early: the earlier they start, the more opportunities they have to perform release tests.

Students are told that when they learn that they failed a release test, they shouldn't first try to fix their code. Instead, they should try to write a test case that replicates that failure, so that when they next perform a release test they have some confidence that they will actually pass the release test.

If students make an incorrect assumption that causes them to fail many of the instructor's test cases, they find out before the project deadline and have a chance to ask questions and try to fix their code.

When it gets down to the last day and the last two release tokens, it replicates much of the pressure that real software developers feel to ensure the quality of their code, and helps them develop good software development skills.

All tests are run as soon as the project is submitted, so instructors can see if students are having particular problems with a test case. That might be because the project specification is unclear, the test case doesn't completely match the project specification, the test case is particularly challenging, or the material required to handle that case hasn't been covered in lecture yet.

All the details of the test case can be revealed immediately after the project deadline, so students get full feedback on the project before moving on to the next assignment.
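The mechanics described above (a small token budget, tokens regenerating 24 hours after use, and only the first two failed test names being revealed) could be sketched roughly like this. The function names, the data shapes, and the default budget of 3 are assumptions for illustration only; this is neither Marmoset's implementation nor existing Galah code.

```python
from datetime import datetime, timedelta

TOKEN_BUDGET = 3                   # assumption: "typically 2 or 3"
REGEN_AFTER = timedelta(hours=24)  # a token comes back 24 hours after use
REVEALED_FAILURES = 2              # only the first two failed names are shown


def tokens_available(spent_at, now, budget=TOKEN_BUDGET):
    """Count the tokens a student can spend right now.

    spent_at is a list of datetimes at which the student spent a token.
    A token spent less than 24 hours ago is still regenerating.
    """
    regenerating = sum(1 for t in spent_at if t <= now < t + REGEN_AFTER)
    return max(budget - regenerating, 0)


def summarize_release_test(results, reveal=REVEALED_FAILURES):
    """Build the limited summary shown after a token is spent.

    results is a list of (test_name, passed) pairs in the order they ran.
    """
    failed = [name for name, passed in results if not passed]
    return {
        "total": len(results),
        "passed": len(results) - len(failed),
        "failed": len(failed),
        "revealed_failures": failed[:reveal],
    }
```

Computing the token count from the timestamps of past uses means there is no counter to reset, so nothing extra would need to run on a schedule just to hand tokens back.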

He calls these release tests. I've never been a fan of that term, though I think I understand the motivation. I think "token tests" is a little clearer, but we may want to spend at least a little while thinking about what we should call them.

ucrcsedept / galah

Implement Public, Final, and Token Tests. #319