jplag / JPlag

State-of-the-Art Software Plagiarism & Collusion Detection
https://jplag.github.io/JPlag/
GNU General Public License v3.0

Develop an end-to-end testing strategy #193

Closed tsaglam closed 1 year ago

tsaglam commented 2 years ago

We currently have no way of telling whether a change made to JPlag affects the quality of its plagiarism detection. This makes refactoring hard, because we never know whether a PR has unintended consequences. The root cause is that JPlag does not have enough test cases. We therefore need a carefully designed test framework that uses different data sets and runs JPlag with different configuration options. The JPlag result object is then used to check whether JPlag produces the expected results. Of course, when we first add the test cases, the expected results are simply the current results JPlag produces for these inputs.
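For illustration, a minimal sketch of such a regression-style end-to-end test with JUnit 5 and the programmatic JPlag API roughly as it looks in v3 (`JPlagOptions`, `JPlag#run`, `JPlagResult#getAllComparisons`); the data set path and the baseline value are made up, and exact API names may differ between JPlag versions:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

import de.jplag.JPlag;
import de.jplag.JPlagComparison;
import de.jplag.JPlagResult;
import de.jplag.exceptions.ExitException;
import de.jplag.options.JPlagOptions;
import de.jplag.options.LanguageOption;

class EndToEndRegressionTest {

    @Test
    void sortingDataSetMatchesRecordedBaseline() throws ExitException {
        // Hypothetical data set: two submissions, one a known plagiarism of the other.
        JPlagOptions options = new JPlagOptions("src/test/resources/datasets/sorting", LanguageOption.JAVA);
        JPlagResult result = new JPlag(options).run();

        // With a single pair of submissions, exactly one comparison is expected.
        JPlagComparison comparison = result.getAllComparisons().get(0);

        // Baseline recorded from the current JPlag version (value is made up).
        // A failing assertion signals that a change affected detection quality.
        assertEquals(62.5, comparison.similarity(), 0.1);
    }
}
```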

The following initial requirements have been identified:

Related research works:

Steps to do:

tsaglam commented 2 years ago

The technical report on JPlag describes different plagiarism classes; this might also be a starting point.

tsaglam commented 2 years ago

@SuyDesignz see this paper for many types of plagiarism patterns: https://ieeexplore.ieee.org/abstract/document/7910274. I also overhauled the issue description to make things clearer.

SuyDesignz commented 2 years ago

@jplag/studdev First of all, I would like to explain the goal of the end-to-end testing strategy: changes to the plagiarism detection in JPlag will be tested, mainly to ensure consistent results when the application is adapted. In addition, I will try to find out, with targeted tests, in which detection phase a change takes effect (optional and still under discussion).

The test cases are based on published work on plagiarism and on the ways its detection can be evaded. The papers used for this purpose are "Detecting Source Code Plagiarism on Introductory Programming Course Assignments Using a Bytecode Approach" by Oscar Karnalim (https://ieeexplore.ieee.org/abstract/document/7910274) and "Detecting Disguised Plagiarism" by Hatem A. Mahmoud (https://arxiv.org/abs/1711.02149).

These works provide basic ideas of what a modification of plagiarized source code can look like. The adaptations cover a wide range of changes, from adding or removing comments to architectural changes in the deliverables.

I have limited myself to the following examples. Points 1-4 reflect the detection levels mentioned above and will be deepened in further examples; mixtures of the points are also planned. Likewise, the assignment of the changes to the phases will be evaluated further and possibly adjusted. The examples come from the papers mentioned before (Detecting Source Code Plagiarism [...], p. 3; Detecting Disguised Plagiarism, p. 4). A code sketch illustrating some of the levels follows the list.

  1. Inserting comments or empty lines (normalization level)
  2. Changing variable names or function names (normalization level)
  3. Inserting unnecessary or changed code lines (token generation)
  4. Changing the program flow (token generation); statements and functions must be independent of each other
     4.1 Variable declaration at the beginning of the program (Detecting Source Code Plagiarism [...])
     4.2 Combining declarations of variables (Detecting Source Code Plagiarism [...])
     4.3 Reuse of the same variable for other functions (Detecting Source Code Plagiarism [...])
  5. Changing control structures (Detecting Disguised Plagiarism)
     5.1 for(...) to while(...)
     5.2 if(...) to switch-case
  6. Modifying expressions (Detecting Disguised Plagiarism and Detecting Source Code Plagiarism [...])
     6.1 (X < Y) to !(X >= Y) and ++x to x = x + 1
  7. Splitting and merging statements (Detecting Disguised Plagiarism)
     7.1 x = getSomeValue(); y = x - z; to y = getSomeValue() - z;
  8. Inserting unnecessary casts (Detecting Source Code Plagiarism [...])
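To make the levels more tangible, here is a small hand-written sketch of how points 2, 5, 6, and 7 could be applied to an original snippet (all identifiers and values are made up):

```java
class DisguiseExample {
    // Made-up values and helper so the snippet compiles; only the transformations matter.
    static int z = 3, n = 5, count = 0;

    static int getSomeValue() { return 42; }

    // Original submission:
    static int original() {
        int x = getSomeValue();
        int y = x - z;
        for (int i = 0; i < n; i++) {
            ++count;
        }
        return y;
    }

    // Disguised copy of the same logic:
    static int disguised() {
        int difference = getSomeValue() - z; // level 7: x and y merged into one statement
        int j = 0;                           // level 2: loop variable renamed
        while (!(j >= n)) {                  // levels 5 + 6: for -> while, (j < n) -> !(j >= n)
            count = count + 1;               // level 6: ++count -> count = count + 1
            j = j + 1;
        }
        return difference;
    }
}
```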

Example test cases were taken from Hochschule Karlsruhe (http://www.home.hs-karlsruhe.de/~pach0003/informatik_1/aufgaben/java.html) and are provided with the adaptations mentioned above. The examples are integrated into the test framework iteratively, in order to evaluate the use of the adaptations and to make adjustments if necessary. Further examples can follow and should be based on typical university submissions. A sample pair is sketched after the list.

  1. Sorting with and without recursion
  2. Calculator or conversion of units (celsius to fahrenheit, ...)
  3. Calculation of the depth of a binary tree
  4. Calculation of cross sums
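To illustrate what such a test pair could look like, here is a hypothetical cross-sum base submission together with a variant disguised via levels 1, 2, and 6:

```java
// Base submission: cross sum (digit sum) of a non-negative number.
class CrossSum {
    static int crossSum(int number) {
        int sum = 0;
        while (number > 0) {
            sum += number % 10;
            number /= 10;
        }
        return sum;
    }
}

// Disguised submission: comments inserted (level 1), identifiers renamed (level 2),
// and expressions rewritten (level 6).
class DigitTotal {
    static int digitTotal(int value) {
        // accumulate the digits one by one
        int total = 0;
        while (!(value <= 0)) {         // (value > 0) negated
            total = total + value % 10; // += expanded
            value = value / 10;         // /= expanded
        }
        return total;
    }
}
```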

tsaglam commented 2 years ago

Sounds good to me so far! I think it is important that we think about where we persist the design rationale behind the data set. E.g. if we have a plagiarized test submission based on certain levels, we need to document that, either as a comment in the submission or maybe in a README. But this is something we can also discuss in a future meeting.
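For example, such a rationale could be recorded as a header comment in the plagiarized submission itself (the format and path are just a suggestion):

```java
/*
 * Plagiarism of: sorting/base/BubbleSort.java (hypothetical path)
 * Applied levels: 2 (identifiers renamed), 5 (for -> while),
 *                 6 (expressions negated)
 */
```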

tsaglam commented 1 year ago

Closed by #548 and #551.