While building a standalone plagiarism scanner based on the VPL ruleset, I noticed that comparing two files gave different results in the standalone version than in the Moodle version.
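After a thorough investigation, I concluded that my code was fine, so I started debugging VPL itself. In doing so, I noticed that the sintax_normalize functions of the C, Python and Java versions all contain structures that assign $token->value, append $token to $ret, then assign $token->value again and append $token a second time.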
While this looks fine at first, it turns out that $token is added to $ret by reference rather than as a copy, so the second $token->value assignment also changes the entry that was added on the line before.
In this example, the resulting $ret array actually contains two + tokens, rather than the expected = token followed by a + token.
After replacing the above structures with this:
$ret [] = new vpl_token( vpl_token_type::OPERATOR, '=', $token->line);
$ret [] = new vpl_token( vpl_token_type::OPERATOR, '+', $token->line);
break;
I can confirm that VPL now yields the exact same results as my standalone version.
I'll make a pull request to fix this for all offending pieces of code.
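For anyone who wants to reproduce the effect outside VPL, here is a minimal sketch. Note that demo_token, split_shared, split_fresh and token_values are hypothetical names made up for this illustration; they are not part of VPL, and demo_token only mimics the three constructor arguments used in the fix above.

```php
<?php
// Hypothetical stand-in for a token class; NOT VPL's vpl_token.
class demo_token {
    public $type;
    public $value;
    public $line;

    public function __construct($type, $value, $line) {
        $this->type = $type;
        $this->value = $value;
        $this->line = $line;
    }
}

// Problematic pattern: the same object is appended twice.
function split_shared(demo_token $token) {
    $ret = array();
    $token->value = '=';
    $ret[] = $token;
    $token->value = '+'; // Also changes $ret[0], which is the same object.
    $ret[] = $token;
    return $ret;
}

// Fixed pattern: one fresh object per emitted token.
function split_fresh(demo_token $token) {
    $ret = array();
    $ret[] = new demo_token($token->type, '=', $token->line);
    $ret[] = new demo_token($token->type, '+', $token->line);
    return $ret;
}

// Helper: join the token values into one string for easy comparison.
function token_values(array $tokens) {
    return implode(' ', array_map(function ($t) {
        return $t->value;
    }, $tokens));
}

echo token_values(split_shared(new demo_token('operator', '+=', 1))), "\n"; // + +
echo token_values(split_fresh(new demo_token('operator', '+=', 1))), "\n";  // = +
```

The first call prints two + values because both array entries point to one object; the second prints = followed by +, which matches what my standalone version produces.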