Submitty / Submitty

Homework Submission, Automated Grading, and TA grading system.
http://submitty.org
BSD 3-Clause "New" or "Revised" License

Initial test suite for Lichen Plagiarism #2574

Closed: bmcutler closed this issue 3 years ago

bmcutler commented 6 years ago

In the lichen repo, make a top-level tests folder. Inside, have files:
tests/submissions/python/student_a/1/foo.py
tests/submissions/python/student_a/2/foo.py
tests/submissions/python/student_b/1/foo.py

~10 'submissions' for each of our supported languages, with clear examples of common, unique, and matching (plagiarism) code.

Then make a directory for the expected output for each:
tests/results/python/ranking/...
tests/results/python/concatenated/..

Then add a simple script that re-runs the plagiarism scripts (no PHP/view, just files) and compares/diffs the result with the expected output, file by file. The script passes if everything matches. Then we can hook this up as a Travis regression test on the Lichen repository.

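The file-by-file comparison described above could look roughly like the sketch below. It is only an illustration: `compare_trees` and `main` are hypothetical names, the directory arguments are placeholders, and the step that actually re-runs the Lichen scripts to produce the actual output is left out because its command line is not given here.

```python
from pathlib import Path


def compare_trees(expected_dir, actual_dir):
    """Return a sorted list of relative paths that differ between two trees.

    A path is reported if the file is missing on either side or if the
    contents are not byte-for-byte identical.
    """
    expected_dir, actual_dir = Path(expected_dir), Path(actual_dir)
    expected = {p.relative_to(expected_dir) for p in expected_dir.rglob("*") if p.is_file()}
    actual = {p.relative_to(actual_dir) for p in actual_dir.rglob("*") if p.is_file()}

    # Files present on only one side are mismatches by definition.
    mismatches = set(expected ^ actual)

    # Files present on both sides must have identical bytes.
    for rel in expected & actual:
        if (expected_dir / rel).read_bytes() != (actual_dir / rel).read_bytes():
            mismatches.add(rel)

    return sorted(p.as_posix() for p in mismatches)


def main(argv):
    """Print every mismatch; return 0 if the trees match, 1 otherwise."""
    expected_dir, actual_dir = argv
    mismatches = compare_trees(expected_dir, actual_dir)
    for rel in mismatches:
        print("MISMATCH:", rel)
    return 1 if mismatches else 0
```

A CI job could end with `sys.exit(main(sys.argv[1:]))`, so that any mismatch produces a non-zero exit status and fails the build.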
tushargr commented 6 years ago

@bmcutler The "common" match type is only possible when at least 20 users have the same code/text, so I guess I need 20 student submissions for each language. I made this for plaintext and tested it on the interface. It shows matches as expected.

For Python, I added common and suspicious code. But unique code is difficult, as there are 20 students. Should I skip unique code?

bmcutler commented 6 years ago

Is the number 20 hardcoded? Probably. It should be connected to the "Threshold to be considered plagiarism" from the configuration. To make it easier to write these tests, we should set that number to 2, or 3, or 5, as appropriate to test the functionality. In a real course, the instructor will likely set this number to 5 or 10 or even larger for a class of 100 or more.
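To illustrate the idea of taking the threshold from configuration rather than hardcoding 20, here is a minimal sketch. It is not Lichen's actual code: `classify_hashes`, the `common_threshold` parameter, and the three labels are assumptions made for the example, and a "hash" stands for whatever fingerprint the tool computes over a piece of code.

```python
from collections import defaultdict


def classify_hashes(submissions, common_threshold):
    """Label each hash as 'unique', 'match', or 'common'.

    submissions: dict mapping a student id to the hashes found in that
        student's code (hypothetical input shape).
    common_threshold: number of distinct students at or above which a
        hash is treated as common code rather than a suspicious match.
    """
    # Count distinct students per hash, so repeats within one
    # submission do not inflate the count.
    students_per_hash = defaultdict(set)
    for student, hashes in submissions.items():
        for h in hashes:
            students_per_hash[h].add(student)

    labels = {}
    for h, students in students_per_hash.items():
        if len(students) >= common_threshold:
            labels[h] = "common"
        elif len(students) >= 2:
            labels[h] = "match"
        else:
            labels[h] = "unique"
    return labels
```

With `common_threshold=3`, a test suite needs only three students sharing a snippet to exercise the "common" case, which leaves room for unique and matching code among the remaining submissions.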