github-education-resources / teachers

Join the discussion in the GitHub Education Community:
https://education.github.community

Public repositories and cheating #11

Open georgeyiu opened 10 years ago

georgeyiu commented 10 years ago

Having TA'ed at Berkeley for a couple of semesters now, the most serious problem we have with GitHub is the amount of cheating that it enables. Every semester, some students take their work from the semester and dump it into public GitHub repositories to show it off. With hundreds of students in each class, it is inevitable that some students will look online for past solutions to projects and homework. For almost every project at Berkeley that has existed for at least one semester, you can find some solution code on GitHub.

For our class, we currently provide a private repository via an educational organization to our students through the entire term, but it doesn't stop them from taking their repository and hosting it publicly after the term. Does anyone (especially GitHub staff) have experience in the best way to solve this problem? Some possibilities I'd imagine include:

augbog commented 10 years ago

I believe that in another thread, someone shared their thoughts on cheating: rather than doing our best to discourage cheating, we should focus on finding ways to encourage students to share information. You can find his comment here. Whether this is the right solution for your scenario, I am not sure. I definitely think this mentality is a step in the right direction, though.

geoff-nixon commented 9 years ago

Here's a good problem for an assignment.


Write a program that computes the difference between n different solutions to this assignment, and applies an algorithm or heuristic to determine the extent to which each solution is derivative from others. Set a threshold value of this heuristic to make a determination (along with any other relevant data) as to whether, in your opinion, a solution should be considered plagiarized.

To receive a passing grade, your program:


I think the kids in EECS should be able to handle it.
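As a rough illustration of what a starting point for such an assignment might look like, here is a minimal sketch that compares every pair of submissions and flags those above a cutoff. It uses Python's standard `difflib` rather than git-diff, and the names `similarity` and `flag_pairs`, as well as the 0.8 threshold, are assumptions made for this example, not anything proposed in this thread.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Return a 0..1 line-level similarity score between two source texts."""
    # Ignore blank lines and leading/trailing whitespace so that
    # re-indenting a file does not hide copied lines.
    a_lines = [line.strip() for line in a.splitlines() if line.strip()]
    b_lines = [line.strip() for line in b.splitlines() if line.strip()]
    return SequenceMatcher(None, a_lines, b_lines, autojunk=False).ratio()


def flag_pairs(solutions: dict[str, str], threshold: float = 0.8):
    """Compare all pairs of solutions and return those at or above threshold.

    The threshold is an arbitrary illustrative value, not a calibrated one.
    """
    flagged = []
    for (name_a, src_a), (name_b, src_b) in combinations(sorted(solutions.items()), 2):
        score = similarity(src_a, src_b)
        if score >= threshold:
            flagged.append((name_a, name_b, score))
    return flagged
```

A score above the threshold is only a prompt for a human to look closer; renamed variables or reordered functions would slip past a line-based comparison like this one.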

georgeyiu commented 9 years ago

Many Berkeley classes already use Moss for plagiarism detection, which is much more advanced than anything one person is going to come up with as a solution to an assignment or a just-for-fun side project (and far beyond a basic git-diff). This is perhaps the best we can do so far, though there are still many cases and styles of cheating that go uncaught by this text-based processing. There is a bit of work going on in one of our lower-division classes to analyze patterns in cheating and perhaps build a framework for detecting it, but this is just getting started, and any useful results would be a long way off.
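For readers unfamiliar with how text-based detection can go beyond a plain diff: the published description of Moss is built on document fingerprinting (the "winnowing" technique). The following is a toy sketch of that general idea only, not Moss's actual implementation. The function names, the use of CRC32 as the hash, and the `k` and `window` values are all assumptions chosen for illustration.

```python
import zlib


def normalize(source: str) -> str:
    """Remove all whitespace and lowercase, so layout changes do not matter."""
    return "".join(source.split()).lower()


def fingerprints(source: str, k: int = 5, window: int = 4) -> set[int]:
    """Select a subset of k-gram hashes by keeping each window's minimum."""
    text = normalize(source)
    hashes = [zlib.crc32(text[i:i + k].encode()) for i in range(len(text) - k + 1)]
    if not hashes:
        return set()
    if len(hashes) <= window:
        return {min(hashes)}
    return {min(hashes[i:i + window]) for i in range(len(hashes) - window + 1)}


def overlap(a: str, b: str, k: int = 5, window: int = 4) -> float:
    """Jaccard overlap (0..1) between the fingerprint sets of two texts."""
    fa, fb = fingerprints(a, k, window), fingerprints(b, k, window)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)
```

Because fingerprints come from short character runs rather than whole lines, this kind of comparison can still find shared fragments after code is reordered, though it remains purely textual and is easy to defeat with systematic renaming unless identifiers are normalized first.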

geoff-nixon commented 9 years ago

I think you might have missed my point there. The idea was to use various combinations of very advanced git-diff features (or anything else; just somewhere to start) to build, over time, a tool like the one you mentioned, since comparing and evaluating against previous solutions is a requisite part of the assignment itself. Anyway.

georgeyiu commented 9 years ago

Sorry, I didn't mean to come off as completely disregarding your suggestions. Using versioning to detect cheating is a fantastic idea, and a full-fledged system would likely even involve analyzing individual commits and perhaps using git-diff in unusual ways. Thanks for your ideas!

brock commented 9 years ago

@georgeyiu I'm curious to hear if you have any updates, or changes to the way you are doing things that have addressed this issue.

jaredchandler commented 7 years ago

We have a similar issue at another university.

aaomidi commented 6 years ago

Without this, people would probably share code internally, and then systems like Moss would fail to work. Most of the time, coursework is the only thing a student has to offer potential employers upon graduation.

As a TA at Drexel University, I don't have much control over what happens. However, I've given it a bit of thought. My suggestion: for higher-level classes, I'd provide students with an overview of an idea and ask them to write the specs for it, essentially designing the project, and submit that as their first homework. For their second homework, they build out this spec. This would let them learn about writing specifications, and would essentially help them be better CS students upon graduation.