Kattis / problem-package-format

Kattis problem package format specification
https://www.kattis.com/problem-package-format

Static validator needs some clarification #329

Open niemela opened 2 months ago

niemela commented 2 months ago

Comments from @gkreitz:

  • There is no specification how the validator renders a judgment (I assume exit 42/43?).
  • There is no way I see for a static validator to indicate what languages it supports (I guess you can sort of do something default for non-supported languages, but.. yuck...).

Thoughts?

niemela commented 2 months ago

There is no way I see for a static validator to indicate what languages it supports (I guess you can sort of do something default for non-supported languages, but.. yuck...).

IMO, the static validator must handle all languages allowed by languages in problem.yaml. Maybe we should say that explicitly?

Note that "handle" could mean "if C++, just accept".
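A rough sketch of what "handle every language, but only really check some" might look like. Everything here is an assumption made for illustration: the spec does not say how a static validator learns the language of a submission, so this sketch guesses from the file extension, and `check_python`, `CHECKERS` and `accept_submission` are hypothetical names. The placeholder rule is a crude substring test, not real analysis.

```python
import os


def check_python(source: str) -> bool:
    # Placeholder rule for illustration only: reject any occurrence of "while".
    # A substring test also matches comments and strings; a real check would parse the code.
    return "while" not in source


# Hypothetical mapping from file extension to a language-specific checker.
CHECKERS = {".py": check_python}


def accept_submission(files: dict) -> bool:
    """files maps a file name to its source text.

    Files in a language without a registered checker are accepted by default.
    """
    for name, source in files.items():
        extension = os.path.splitext(name)[1].lower()
        checker = CHECKERS.get(extension)
        if checker is not None and not checker(source):
            return False
    return True
```

With this shape, a C++ submission passes untouched while a Python submission is actually inspected.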

Tagl commented 2 months ago

There is no specification how the validator renders a judgment (I assume exit 42/43?).

This is indeed missing. It should report in the same way as the output validator, with exit codes 42 and 43, using judgemessage.txt, teammessage.txt and score.txt.
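A minimal sketch of that reporting convention, assuming it is adopted as described. The exit codes 42 and 43 and the three file names come from the comment above. The helper name `report`, and the assumption that the validator is given the path of a feedback directory, are hypothetical, since the invocation is exactly what this issue says is unspecified.

```python
import os

EXIT_ACCEPT = 42  # same meaning as for output validators
EXIT_REJECT = 43


def report(feedback_dir, accepted, judge_msg="", team_msg="", score=None):
    """Write the feedback files and return the exit code to use (hypothetical helper)."""
    if judge_msg:
        with open(os.path.join(feedback_dir, "judgemessage.txt"), "w") as f:
            f.write(judge_msg + "\n")
    if team_msg:
        with open(os.path.join(feedback_dir, "teammessage.txt"), "w") as f:
            f.write(team_msg + "\n")
    if score is not None:
        with open(os.path.join(feedback_dir, "score.txt"), "w") as f:
            f.write(f"{score}\n")
    return EXIT_ACCEPT if accepted else EXIT_REJECT
```

A validator would then end with something like `sys.exit(report(feedback_dir, ok, team_msg="Use a for loop."))`.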

gkreitz commented 2 months ago

I think it would also be very good to have an example of a (useful) static validator. I'm a bit unsure when one would use this feature, and I also worry about how robust implementations will be if people actually do use it. Building something that meaningfully and robustly does static analysis (possibly of several languages), given only a directory that contains a mix of user submission and compilation artifacts, feels non-trivial to me, but I may be missing something here.

Tagl commented 2 months ago

I'll try to provide one for an old problem we had at RU as soon as I can. This is all from an educational point of view: the programming language was Python in all cases, and in most cases the teachers wanted to fail submissions for using certain things, such as while, for, recursion, or certain builtin functions. In Python you can kind of do this already with include. Almost always your static validator would be focused on a specific language, not multiple. Having the compiled binaries and artifacts I'm not sure is useful or necessary (it might be), but having the source code is.
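To make the kind of check described above concrete, here is a small sketch using Python's standard `ast` module. It is not the validator used at RU. The function name `find_violations` and its parameters are hypothetical, the default list of banned calls is only an example, and the recursion check only catches a function that calls itself directly by name. Comprehensions are not treated as loops here.

```python
import ast


def find_violations(source, ban_loops=True, ban_recursion=True,
                    banned_calls=("eval", "exec")):
    """Return human-readable messages for banned constructs found in source."""
    tree = ast.parse(source)  # raises SyntaxError if the submission does not parse
    problems = []
    for node in ast.walk(tree):
        # for / while statements
        if ban_loops and isinstance(node, (ast.For, ast.AsyncFor, ast.While)):
            problems.append(f"line {node.lineno}: loop statement is not allowed")
        # calls to banned builtins, matched by plain name only
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in banned_calls):
            problems.append(f"line {node.lineno}: call to {node.func.id}() is not allowed")
        # direct recursion: a function whose body calls its own name
        if ban_recursion and isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for inner in ast.walk(node):
                if (isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name)
                        and inner.func.id == node.name):
                    problems.append(f"line {inner.lineno}: {node.name}() calls itself")
    return problems
```

The opposite rule ("a for loop must be used") follows the same pattern: walk the tree and reject if no `ast.For` node is found.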

gkreitz commented 2 months ago

I'll try to provide one for an old problem we had at RU as soon as I can.

Thank you!

Having the compiled binaries and artifacts I'm not sure is useful or necessary (it might be), but having the source code is.

I didn't phrase myself very well. My point was basically that if I wrote a static analyzer, I would much prefer to have its input only be exactly the files submitted by the user, instead of those plus whatever compilation artifacts may have been produced by the compilation process used by a judging system (which an average problem author has little insight into, and which may easily change).

Tagl commented 2 months ago

A few things I found from Fall 2023, where I helped a few teachers use the problem package format. These are 4 different examples:

A teacher wanted students to specifically use a for loop to find two-digit numbers less than $x$ satisfying a specific property. It was possible to get the right output with very few if checks (in this case, one), but the point was to have students program this very simple exploratory search. Is a problem like this silly in general? Yes, I think so. But it has a purpose to fulfill, and there are other similar problems that are less silly. I suggested simply changing the property so that a for loop would be much more reasonable than an if, but this was what they wanted. What was desired was for the validation process to catch the things that are easy to catch, such as "is a for loop used?", and to instantly reject submissions that break the given rule.

Now I tried to use include to perform the static validation. I provided my own main.py and then tried to load/run the student's submitted "main" file. Now there were a few issues with this approach, one being the included main file would overwrite the student's file if they had the same name. Another being that any code in if __name__ == "__main__": blocks would not run when loading the code unless I added some hacks I tried which broke other things. Now this is probably possible, but it is easier to have a separate static validator tool to check the code.

Another kind of static validation that was desired was related to code formatting, for example ensuring spaces around operators. It's very arguable whether this belongs in a problem package rather than in a language + linter combo available on the judge system, especially if it is a full formatter or formatting checker such as ruff or black. In some cases, though, it is reasonable to do checks like these in an educational problem when something in particular is being introduced as a best practice or popular standard, for example when type hints are introduced in a Python programming class.
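For the type-hint case, a check could be sketched along these lines with the standard `ast` module. This is an illustration only: `missing_annotations` is a hypothetical name, skipping `self` and `cls` is a choice made here, and `*args` / `**kwargs` are ignored for brevity.

```python
import ast


def missing_annotations(source):
    """List (function name, unannotated parameters, return annotation missing?)."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            params = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
            unannotated = [p.arg for p in params
                           if p.annotation is None and p.arg not in ("self", "cls")]
            if unannotated or node.returns is None:
                missing.append((node.name, unannotated, node.returns is None))
    return missing
```

An empty result means every function found has annotated parameters and a return annotation.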

Yet another case was a teacher wanting students to submit a readme.txt with some information alongside their code. A static validator would have been useful there too, but I used the output validator and include to check for this instead. The intent was to check at least whether the file existed in the submission, and possibly to check its contents a little more.

We also use an assignment in another course that was never put into the format. In that assignment, students have a limited number of arithmetic/logical operations available to implement each function, so it is a form of code golf that also requires correct output.

niemela commented 2 months ago

I didn't phrase myself very well. My point was basically that if I wrote a static analyzer, I would much prefer to have its input only be exactly the files submitted by the user, instead of those plus whatever compilation artifacts may have been produced by the compilation process used by a judging system (which an average problem author has little insight into, and which may easily change).

I was thinking the same, but @pehrsoderman pointed out that e.g. in Java it could be significantly easier to do some kinds of static validation on the .class file.

simonlindholm commented 2 months ago

Now there were a few issues with this approach, one being the included main file would overwrite the student's file if they had the same name. Another being that any code in if __name__ == "__main__": blocks would not run when loading the code unless I added some hacks I tried which broke other things. Now this is probably possible

(It is: for the multipass problems we implemented using include/ for EGOI, we solved the first issue by putting main.py in a subdirectory with an unpredictable name, and the second by using runpy.run_path. It did take a bit of tinkering to come up with these workarounds.)
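The `runpy.run_path` part of that workaround can be sketched as follows. This is not the EGOI code. `run_submission` is a hypothetical wrapper, and capturing stdout is added here only to make the effect visible. The key detail is `run_name="__main__"`, which makes `if __name__ == "__main__":` blocks in the loaded file execute.

```python
import contextlib
import io
import runpy


def run_submission(path):
    """Execute a Python file as if it were the main script; return what it printed."""
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        # run_name="__main__" is what triggers the submission's main guard
        runpy.run_path(path, run_name="__main__")
    return captured.getvalue()
```

Without the `run_name` argument, `runpy.run_path` uses a different module name and the guarded block is skipped, which matches the problem described in the quoted comment.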

Matistjati commented 1 month ago

Further arguments for the existence of static validators: I hosted an April Fools' contest where the task was to solve some problem, and the score of a submission was the percentage of the character "a" in the source code (of course, this wasn't specified in the statement; it's April Fools'). Problem link.

Another use would be a code golf contest. In our code golf contests, we usually make \r\n count as a single character so that Windows users aren't punished, an arbitrary decision that requires static validators.
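Both scoring rules are easy to express once the validator sees the raw source. The sketch below is illustrative only and is not the code used in those contests. The function names are hypothetical, and counting characters after UTF-8 decoding (rather than bytes) is an assumption.

```python
def golf_length(source: bytes) -> int:
    """Source length in characters, with each \\r\\n pair counted as one character."""
    text = source.decode("utf-8", errors="replace")
    return len(text.replace("\r\n", "\n"))


def percent_of_char(source: bytes, ch: str = "a") -> float:
    """Percentage of characters in the source that equal ch (0.0 for an empty file)."""
    text = source.decode("utf-8", errors="replace")
    if not text:
        return 0.0
    return 100.0 * text.count(ch) / len(text)
```

The resulting number could then be reported as the score, for example through score.txt if the reporting convention proposed earlier in this thread is adopted.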