Closed Kobzol closed 2 months ago
Hi @Kobzol, what you request is definitely possible, but I suggest a better alternative:
The `dolos-web` module containing the code for the Web UI actually has the HTML/CSS/JS files needed to visualize every report based on the CSV files that are generated for each analysis. So where MOSS creates separate HTML files for each report, the HTML/CSS/JS files that Dolos uses stay the same.
The command `dolos serve` is hence nothing more than statically hosting the contents of the report together with the static files of the `dolos-web` package. This is currently handled by a <100 LOC class in `server.ts`, but the important part is this:
https://github.com/dodona-edu/dolos/blob/d39dfc984cdb964589869bd16ab69a372f3acb73/cli/src/cli/server.ts#L50-L61
If a request is for `/data`, it is a report CSV file, so the server looks in the report directory.
This should be easy to achieve with Django as well and would save you from making a copy of the identical ~10MB web UI files for each report. It also has the added benefit that if we add new features to the Web UI, you could just update to the latest version and all previously generated reports can use these new features.
Let me know if this approach would work for you.
You don't need to build the `dolos-web` UI yourself BTW, we publish the prebuilt HTML/CSS/JS files on NPM: https://www.npmjs.com/package/@dodona/dolos-web?activeTab=code
So you would only need to download the package, extract it and statically host the files in the `dist/` folder.
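The extraction step could look roughly like the sketch below. It assumes the package tarball has already been downloaded (for instance with `npm pack @dodona/dolos-web`) and that, like other npm tarballs, it keeps its contents under a top-level `package/` directory. The function name `extract_dist` and the target directory are hypothetical, not part of Dolos.

```python
import tarfile
from pathlib import Path


def extract_dist(tarball: Path, target: Path, prefix: str = "package/dist/") -> list[str]:
    """Copy every regular file stored under `prefix` in the tarball into `target`.

    `prefix` is an assumption about the tarball layout; adjust it if the
    archive you downloaded is organised differently.
    """
    extracted = []
    with tarfile.open(tarball, "r:gz") as archive:
        for member in archive.getmembers():
            if not (member.isfile() and member.name.startswith(prefix)):
                continue
            relative = Path(member.name[len(prefix):])
            # Skip entries that would end up outside the target directory.
            if relative.is_absolute() or ".." in relative.parts:
                continue
            destination = target / relative
            destination.parent.mkdir(parents=True, exist_ok=True)
            destination.write_bytes(archive.extractfile(member).read())
            extracted.append(relative.as_posix())
    return sorted(extracted)
```

The resulting directory can then be served as plain static files by whatever the web app already uses for static content.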
That sounds great, thank you! So if I understand it correctly, when the SPA is started, it always tries to load `/data/[metadata|files|kgrams|pairs].csv`, and then displays whatever is returned to it by the backend?
If I wanted to keep your original source code without doing any modifications to it, I'd need to somehow disambiguate different checking results (e.g. using `/data/...csv?id=XYZ` or something like that). Can that be configured somehow using the current code, or do you think that I'd need to fork and modify the frontend to change this?
It actually loads `/data/*.csv` relative to the `index.html` file. So you could host the same Dolos-web files on `/reports/*/index.html` for every report and let `/reports/{id}/data/*.csv` return the CSV files belonging to that report id. So there is no need to change anything.
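A minimal sketch of that routing, written as a plain function that a Django view (or any other backend) could call to decide which file to send. The directory constants, the function name `resolve` and the on-disk layout (one directory of CSV files per report id) are assumptions for illustration, not something Dolos prescribes.

```python
from pathlib import Path, PurePosixPath
from typing import Optional

# Hypothetical locations: one shared copy of the web UI, one directory per report.
WEB_DIST = Path("/srv/dolos-web/dist")
REPORTS_ROOT = Path("/srv/dolos-reports")


def resolve(url_path: str,
            web_dist: Path = WEB_DIST,
            reports_root: Path = REPORTS_ROOT) -> Optional[Path]:
    """Map a request path under /reports/{id}/ to the file that should be served."""
    parts = PurePosixPath(url_path).parts
    # Expect ("/", "reports", "<id>", ...) and refuse any parent-directory segment.
    if len(parts) < 3 or parts[0] != "/" or parts[1] != "reports" or ".." in parts:
        return None
    report_id, rest = parts[2], parts[3:]
    # /reports/{id}/data/*.csv comes from that report's own directory.
    if len(rest) == 2 and rest[0] == "data" and rest[1].endswith(".csv"):
        return reports_root / report_id / rest[1]
    # Everything else is the shared web UI, identical for every report.
    if not rest:
        return web_dist / "index.html"
    return web_dist.joinpath(*rest)
```

For example, `resolve("/reports/42/data/pairs.csv")` points into the directory of report 42, while `resolve("/reports/42/")` and `resolve("/reports/7/")` both point at the same shared `index.html`.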
For the Dolos server we actually have a special mode to build the frontend to upload, list and go to the reports. But we host the reports in the way described above.
Ah, cool, I thought that it's hardcoded to load from the root `/`, but if it's relative, then that should indeed be ideal for our use-case. Thanks a lot for explaining this to me! :) I will try to integrate it within our system and let you know how it went.
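To see what "relative" means for the resulting URLs, here is a small sketch using Python's standard URL resolution, which follows the same rules a browser applies. It assumes the frontend requests a relative reference of the form `data/<name>.csv`; the helper `data_url` and the host name are made up for the example.

```python
from urllib.parse import urljoin


def data_url(page_url: str, name: str) -> str:
    """Resolve the relative reference data/<name>.csv against the page URL."""
    return urljoin(page_url, f"data/{name}.csv")


# Served from index.html or from the directory URL with a trailing slash,
# the request stays inside the report's own path.
print(data_url("https://example.org/reports/42/index.html", "pairs"))
print(data_url("https://example.org/reports/42/", "pairs"))

# Without the trailing slash the last segment is replaced, so the request
# would land on /reports/data/pairs.csv instead.
print(data_url("https://example.org/reports/42", "pairs"))
```

One practical consequence is that the report page should be reachable with a trailing slash (or as `index.html`), otherwise the CSV requests lose the report id.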
We're trying to make Dolos as flexible as possible so that integrations like this are feasible, so definitely get in touch to let us know how it goes. We're looking forward to seeing how you use Dolos.
Thanks to your hints, I was able to integrate Dolos in our system quite easily, thank you!
Ideally, we'd need multi-file submissions per student, but for now I simulate that by simply concatenating all files together. We'll see how that works.
We're still keeping MOSS for now, as it returns different results and has a bit more intuitive pair visualization (with the differently colored section per plagiarized block), but it's very nice to have an alternative (that can't timeout because the MOSS server is down, lol).
Glad to hear the integration worked out!
Supporting multiple files per submission is indeed not possible. We solve it as well by concatenating multiple files. We do have an open issue (#1121) but it is currently not our priority.
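The concatenation workaround could be sketched as follows. The separator line, the sort order and the function name `concatenate_submission` are illustrative assumptions; neither comment above specifies how the files are actually joined.

```python
from pathlib import Path


def concatenate_submission(files: list[Path], output: Path,
                           comment_prefix: str = "//") -> Path:
    """Join all files of one submission into a single file for analysis.

    Each file is preceded by a comment line with its name, so a match in the
    combined file can still be traced back to the original file.
    """
    chunks = []
    for path in sorted(files):  # stable order, so reruns give the same output
        text = path.read_text(encoding="utf-8", errors="replace")
        chunks.append(f"{comment_prefix} ---- {path.name} ----\n{text.rstrip()}\n")
    output.write_text("\n".join(chunks), encoding="utf-8")
    return output
```

The `comment_prefix` should match the language being checked (for example `#` for Python submissions).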
As for the pair visualization: we deliberately used only one color to keep the visual complexity in check. You used to be able to click on a fragment such that it would highlight the matching parts, but this has broken at one point. There are some other improvements possible with the comparison as well.
We welcome contributions if you want to help us out with this part. Let me know if that is the case and I can write down the changes required for the different features.
In any case, good luck with using Dolos and we welcome any additional feedback that you have :blush:
> You used to be able to click on a fragment such that it would highlight the matching parts, but this has broken at one point.
This seems to work for me, so maybe it got unbroken in the meantime :laughing:
I can help by contributing some changes. It might take me some time, but if you can write down some hints, that would help a lot, of course. The feature that I would appreciate the most is probably https://github.com/dodona-edu/dolos/issues/1121.
Anyway, thanks a lot for a great tool!
The best bugs are the ones that fix themselves :sweat_smile:
I have extended #1121 with some initial pointers, but if you want I could set up a video call to walk you through the project.
I will also make a separate issue for clarifying the matched fragments with some ideas I have.
Hi, thanks for this great tool! I hope that it will be able to replace MOSS in our code submission tool. We have been using MOSS for a few years, but often we have issues with it (primarily because of the fact that it is only available as a remote API that is often slow or outright offline).
Is your feature request related to a problem? Please describe.
We run code plagiarism checks in an automated manner on code submissions from many (several hundreds of) students, so we want to have plagiarism checking integrated directly within our submission website, to avoid the need for teachers to manually use a CLI tool (or an external website) to see the plagiarism check results. With MOSS, we use their API, which generates a set of self-contained HTML files, which we then serve and show to teachers directly through our website.
However, I haven't found a similar feature in Dolos. It can visualize the results in a website (which looks totally awesome, and I would love to show it to teachers using our tool!), however, it needs the `dolos serve` command that actually serves the website. This is difficult to combine with our existing (Django) web app, as we would somehow need to keep a separate persistent Dolos server running per plagiarism checking result, which is not really feasible. Even if it was possible to use just a single Dolos server for this, it would be a bit complex, and we would need to have a way to send a specific result to the Dolos server through the URL, to have the ability to display different results in our website.
Describe the solution you'd like
Ideally, I would like to have a way to export the Dolos website to a self-contained directory that could just be opened in a browser (without any active web server) and it would "just work" :) Since the web is mostly a SPA, this probably shouldn't be that difficult, I hope.
So ideally, I'd like to have a command like `dolos export`, which would act in the same way as `dolos serve`, but it would just generate a directory with HTML/JS/CSS files, rather than starting a web server. Or, alternatively, there could be some output format like `dolos -f web-dir` that would do the same thing.
Describe alternatives you've considered
I could create my own web visualization (integrated within our website) out of the generated CSV files, but this is obviously a lot of work and I would be duplicating what the Dolos web already does. Alternatively, I could open the Dolos website programmatically and then somehow "snapshot" it (using a headless browser?) to generate the self-contained directory, but that would be a very complex process.
Let me know what you think about this idea and how complex you think it would be. I can try to send a PR if you think that it's feasible and if you can guide me to where I should start looking.