For token string normalization to work, the code must be formatted beforehand. I did this by first removing the comments with JavaParser (see here) and then formatting the result with Eclipse, using a special config and the following command:
Since both dependencies are undesirable in JPlag, other solutions for these two tasks need to be found before token string normalization can be used. My suggestion would be to remove the comments by parsing the submission with javac and turning the AST back into a string with the existing toString method (which I probably should have done in the first place), and then to use Spotless for the formatting. Spotless can use the Eclipse config AFAIK.
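To make the javac idea a bit more concrete, here is a rough sketch of how the comment removal could look. It is not what I actually ran; the class name `CommentStripper`, the helper `StringSource`, and the method `stripComments` are made up for illustration. It only relies on the JDK's own `javax.tools` and `com.sun.source` APIs. One caveat I have not verified thoroughly: line and block comments are dropped by the parser, but Javadoc comments attached to declarations may still be printed by `toString`, so that would need checking.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.util.JavacTask;

import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.List;

// Hypothetical helper, not part of JPlag: parses a source string with javac
// and prints the resulting AST back to text.
final class CommentStripper {

    // In-memory source file so that nothing has to be written to disk.
    private static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String fileName, String code) {
            super(URI.create("string:///" + fileName), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Returns the pretty-printed AST of the given source.
    // Only parsing is performed; no attribution or code generation.
    static String stripComments(String fileName, String source) {
        // Requires a JDK; on a plain JRE getSystemJavaCompiler() returns null.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null,
                List.of(new StringSource(fileName, source)));
        StringBuilder result = new StringBuilder();
        try {
            for (CompilationUnitTree unit : task.parse()) {
                // toString() on the tree pretty-prints it in javac's own layout.
                result.append(unit);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toString();
    }
}
```

The output uses javac's own layout, which is why a formatting pass (e.g. Spotless with the Eclipse config) would still have to follow.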
We could add an option to JPlag to format the code of the submissions, as this could be useful independently of the normalization for manual inspection.