sahilbansal17 / Competitive_Coding

This repository contains some useful codes, techniques, algorithms and problem solutions helpful in Competitive Coding.
GNU General Public License v3.0
407 stars 308 forks

Plagiarism checker on Pull Requests #504

Closed sahilbansal17 closed 4 years ago

sahilbansal17 commented 4 years ago

This repository is getting a lot of PRs in which the code is copied from GeeksforGeeks or other platforms.

It would be nice if we could use a plagiarism checker API, or build a custom tool, to detect whether a PR's code is copied from somewhere by checking some keywords along with the code itself.

For example, a binary search implementation can be compared with similar code on GFG, HackerEarth, etc., and a plagiarism score can be assigned to the PR.
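As a rough illustration of what such a plagiarism score could look like, here is a minimal sketch in Python. It is only one possible approach, not an existing tool: it strips C-style comments, splits the code into tokens, and reports the Jaccard overlap of token 4-grams between a submission and one reference snippet. The function names (`tokenize`, `shingles`, `plagiarism_score`) and the choice of `k=4` are assumptions made for this example.

```python
import re


def tokenize(source: str) -> list[str]:
    # Drop // and /* */ comments (a naive pass that also affects string
    # literals containing these sequences), then split into identifiers,
    # integer literals, and single non-whitespace symbols.
    source = re.sub(r"//[^\n]*|/\*.*?\*/", " ", source, flags=re.S)
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)


def shingles(tokens: list[str], k: int = 4) -> set[tuple[str, ...]]:
    # Every run of k consecutive tokens; empty if there are fewer than k tokens.
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def plagiarism_score(submitted: str, reference: str, k: int = 4) -> float:
    # Jaccard similarity of the two shingle sets, in the range 0.0 to 1.0.
    a = shingles(tokenize(submitted), k)
    b = shingles(tokenize(reference), k)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Because the comparison works on tokens, changes in whitespace or comments alone do not lower the score. Renamed variables do lower it, so a real checker would probably need to normalize identifiers as well.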

puthusseri commented 4 years ago

Could you please explain how we can implement this concept?

sahilbansal17 commented 4 years ago

@puthusseri I am also not completely sure about it. We need to see whether there are any open APIs for this that can help.

harshraj22 commented 4 years ago

This might help, though I am not sure how to automate it for PRs. For the plagiarism check, we could store code from popular websites locally rather than scraping them every time a PR is opened. @sahilbansal17, let me know if you have any suggestions on how to automate this for PRs. The issue seems very interesting and I would like to work on it.