cms-dev / cms

Contest Management System
http://cms-dev.github.io/
GNU Affero General Public License v3.0

Support plagiarism detection #764

Open lw opened 7 years ago

lw commented 7 years ago

We may want to give contest administrators a way to determine whether some submissions are too similar to public code or to each other. This is probably most pertinent to online contests or classroom use rather than onsite contests, which puts it somewhat outside the main scope of CMS.

I don't think implementing this ourselves is the best way to go. I believe there already exist such tools, with sophisticated algorithms and large corpora of sources. We should provide a way for CMS to interface with them.

I wouldn't be surprised if this issue had arisen before, and I would love to hear from administrators who faced it how they addressed it.

wil93 commented 7 years ago

I just implemented cmsExportSubmission for this :)

After exporting everything, I run JPlag on the submissions folder.
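For readers who have not used such a tool, the idea of checking an exported folder for near-duplicates can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: it assumes a flat folder with one source file per submission (the real export layout may differ), the names `normalize` and `similar_pairs` and the 0.8 threshold are hypothetical, and it uses `difflib` character matching as a crude stand-in. JPlag itself works on token streams with greedy string tiling, so this sketch is not a substitute for it.

```python
import difflib
import itertools
from pathlib import Path


def normalize(source: str) -> str:
    """Collapse whitespace and drop blank lines so reformatting alone does not hide a copy."""
    lines = (" ".join(line.split()) for line in source.splitlines())
    return "\n".join(line for line in lines if line)


def similar_pairs(folder: Path, threshold: float = 0.8):
    """Return (file_a, file_b, ratio) for each pair of files whose similarity is >= threshold.

    Assumes `folder` is flat and holds one source file per submission (hypothetical layout).
    """
    sources = {
        path.name: normalize(path.read_text(errors="replace"))
        for path in sorted(folder.iterdir())
        if path.is_file()
    }
    flagged = []
    # Compare every unordered pair once; this is quadratic in the number of files.
    for (name_a, text_a), (name_b, text_b) in itertools.combinations(sources.items(), 2):
        ratio = difflib.SequenceMatcher(None, text_a, text_b).ratio()
        if ratio >= threshold:
            flagged.append((name_a, name_b, ratio))
    # Most similar pairs first, so a human reviewer sees the strongest matches at the top.
    return sorted(flagged, key=lambda item: item[2], reverse=True)
```

Calling `similar_pairs(Path("submissions"))` on such a folder would list the pairs worth a manual look. A character-level ratio is easy to defeat by renaming variables, which is one reason to hand the folder to a dedicated tool instead.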

(Sent by email in reply to Luca Wehrstedt's comment of Thu, 18 May 2017, 13:13.)


CristianCantoro commented 6 years ago

I have prepared a repo with the scripts we use at the University of Trento for that: cms_check-plagiarism.

The script check_plagiarism.sh works as follows:

You can read more in the section "How it works" of the README.md.