jplag / JPlag

State-of-the-Art Source Code Plagiarism & Collusion Detection
https://jplag.github.io/JPlag/
GNU General Public License v3.0

Report file stability #1190

Open Alberth289346 opened 1 year ago

Alberth289346 commented 1 year ago

The report file is the file produced by the JPlag analysis program and consumed by the report viewer. It is used by all versions of JPlag and all versions of the viewer. That makes the report file very much a shared resource.

In addition, the report file changes over time. Newer versions of both the viewer and the JPlag program get new abilities, or existing abilities are modified. Such changes need to be transported through the report file, which causes semantic and non-semantic changes in the file.

So far, the idea has been that a particular version of the JPlag program decides the content of the report file, and both the report file and the report viewer follow those changes. This works well until you get multiple JPlag versions in use "out there" (some older than others) and people need or want to use a report viewer from a different point in time to inspect their results.


The problems that arise when you try to combine a JPlag program from a different point in time than the report viewer are:


While much of the above is kicking in open doors, the points that stand out seem to be

Some further points on the data file itself:

This leads to the idea that each report file version has a set of "data features" (a collection of data parts) that changes over time.
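
To make the "data features" idea a bit more concrete, here is a minimal sketch of how a viewer could compare the features a report declares against the features it understands. Everything in it is an assumption for illustration: the `data_features` key, the feature names, and the helper functions are hypothetical and are not part of the actual JPlag report format or viewer.

```python
import json

# Hypothetical feature names, invented for this sketch only.
VIEWER_SUPPORTED = {"overview", "comparisons", "clusters"}
# Features this (imaginary) viewer cannot work without.
VIEWER_REQUIRED = {"overview", "comparisons"}


def check_report(report_json, supported=VIEWER_SUPPORTED, required=VIEWER_REQUIRED):
    """Return (missing, unknown) feature names for a report.

    missing: features the viewer requires but the report does not declare.
    unknown: features the report declares but the viewer does not understand.
    """
    report = json.loads(report_json)
    # Assumed layout: the report lists its features under "data_features".
    features = set(report.get("data_features", []))
    missing = required - features
    unknown = features - supported
    return sorted(missing), sorted(unknown)


def can_open(report_json):
    """A report is viewable if no required feature is missing.

    Unknown features are tolerated; the viewer would simply ignore them.
    """
    missing, _unknown = check_report(report_json)
    return not missing
```

With this kind of check, a viewer could open a report from a newer JPlag version (ignoring features it does not know) and give a precise error for a report that lacks something it needs, instead of failing on a plain version number mismatch.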


And this is how far I got. I have probably run miles ahead, taken a wrong turn, and been unclear in my explanation, but hopefully it can act as a source of inspiration towards a more stable report file.

sebinside commented 1 year ago

Thank you for the comprehensive issue, I'll try to keep my answer as short as possible!

In the old JPlag days (prior to the modernization efforts that started at the end of 2020), this was not an issue since JPlag only generated static HTML files as reports. However, the HTML generation was scattered across different places and was so out of date and hard to maintain that we had to replace it. To maximize the independence of the viewing logic and the generation logic, we introduced the Report Viewer with a JSON file to hold the data. While this is a viable way to go, our mistake was to deploy the viewer online. Fun fact: this was not originally planned but happened as a test pilot and then lasted, unfortunately.

TL;DR: To fix this issue once and for all, we are currently working towards integrating the report viewer with the main JPlag deliverable, starting with #1145 and #1176. Self-hosting a (customized) report viewer will of course still be possible, but it will not be the intended way of using JPlag. This is closer to the original approach and should end those nasty side effects of data representation in the JSON file, which had no benefit at all.

sebinside commented 1 year ago

Side note: More information can always be found in #1000, which will stay open until we are done with the major report viewer overhaul, ETA mid 2024.

Alberth289346 commented 1 year ago

Thanks for the update explaining the future path.

tsaglam commented 9 months ago

As an update, with major release 5.0.0 (hopefully Q1 2024), there will still be a report viewer; however, it should be backward compatible. This release also brings the (for now optional) local mode, which will become the main way to use JPlag in the future.