Distinguish output validation from `.ans`-validation

thorehusfeldt commented 11 months ago

BAPC includes an excellent routine for checking the validity of .ans-files with the same case as .in-files.

This does, however, lead to two problems

incompatibility with “the specification” and other tools, which neither expect or allow the presence of, say, an extra output validator when validation:default.
inaccessible documentation and workflows; I can only speak for myself, but it took me a long time to wrap my head around the logic now sketched in https://github.com/RagnarGrootKoerkamp/BAPCtools/blob/master/doc/validation.md#output-validation

I suggest to conceptually distinguish between the validation of data/**/.ans (which I refer to as default answers, here just answers) and the output of a submission (which I refer to as submission output, here just output). The former is a constraint on the problem designer, the latter is a correctness requirement on the solver’s submission.

There are many ways of cutting the cake, but to get the ball rolling, I propose

<problem_name>/answer_validators

The semantics are like input_validators. (In particular, there can be more than one program in that directory, all of which have to pass on all .ans files.) A validator in this directory might also have access to testcase.in and testcase.ans.

RagnarGrootKoerkamp commented 11 months ago

I really like this idea. It would remove a lot of confusion and clearly indicate that this is something outside the spec.

So we could easily already accept answer_validators and map that to what is currently internally called output format validators (but that can then be renamed to just answer validators :) ).

More tricky is what happens with existing output_validators/:

For default-output-validation problems: we can suggest (log) to rename this to answer_validators, and continue to use them as usual.
For custom-output-validation problems, the output validator is currently used for two things:
1. Validate team output, as in the spec.
2. Validate .ans files, outside the spec.
The existing problem with this is that while almost always the .ans file is simply a valid answer itself, there are very ocasional cases where the .ans contains some specific metadata used for validation that is not actually a valid answer itself. So, what we can do is to simply drop this second case and only validate team output.

I think not much is lost with this: Probably in those cases the .ans isn't even read by the output validator in the first place so it's syntax doesn't matter.

The typical case is something like find the maximum of X. Print this maximum score, followed by a specific solution. Then the output validator reads the first line of the .ans to check for the right score and the rest is simply ignored, so the .ans really doesn't need to be validated fully. If wanted a separate answer validator could be created to test that the .ans indeed starts with a score but that's not really so much needed usually I'd say.

One thing I would change maybe is that while currently an output format validator is called as

output_format_validator /path/to/testcase.in /path/to/testcase.ans < /path/to/testcase.ans

it makes more sense for an answer_validator to be called as

answer_validator /path/to/testcase.in < /path/to/testcase.ans

deduplicating the .ans input.

RagnarGrootKoerkamp commented 11 months ago

CC @mpsijm @mzuenni

mpsijm commented 11 months ago

Sounds like a good idea to separate out-of-spec output format validation from the in-spec custom output_validator :slightly_smiling_face:

Regarding the naming, why not output_format_validators, if that's already how we call it internally? It's also consistent with what used to be called input_format_validators in the spec (this is now simply called input_validators, plus I like the fact that this keeps the directories together in alphabetical order. :innocent:

RagnarGrootKoerkamp commented 11 months ago

I agree that having them close in alphabetical order is nice, but really we want to validate the .ans files, not the format of whatever the teams output. And also input format validators doesn't actually occur in the spec anymore I think, so it's nice to have symmetry between testcase input, answer, output and {input,answer,output}_validators/.

mpsijm commented 11 months ago

All right, makes sense! As long as it's clear that there is a distinction in the code and the documentation between "output" and "answer" (because I expect that new people might think they mean the same), I'm fine with answer_validators :slightly_smiling_face:

thorehusfeldt commented 11 months ago

For precision, consider default_answer_validators or ans_validators. I don’t like the former (even though default answer is exactly what the validate), because, there is also a default output validator, which is something else. So people who confuse output and answer would be even more confused.

Output and ans are harder to confuse. ans_file_validators is even more difficult to misunderstand.

I remain weakly in favour of answer_validators.

(Two days later: clarity should trump aesthetics, so after some soul searching I am now mildly in favour of a directory name called ans_file_validators and a concept called “.ans-file validation”. The latter is mildly preferable to “default answer validation” because of the possible confusion with what the “default output validator” does.)

mzuenni commented 8 months ago

How are answer_validator invoked now? @RagnarGrootKoerkamp proposed answer_validator /path/to/testcase.in < /path/to/testcase.ans instead of the previous output_format_validator /path/to/testcase.in /path/to/testcase.ans < /path/to/testcase.ans

However, the second option makes it easier to reuse the output_validator as answer_validator (like bapctools did before)?

thorehusfeldt commented 8 months ago

Believe it or not, the documentation in my PR is quite up to date: https://github.com/RagnarGrootKoerkamp/BAPCtools/blob/b30fbc4150e0cbec9855325d741ed370fe44e455/doc/validation.md

It is as you say. In summary,

inval [flags] < case.in
ansval case.in [flags] < case.ans
outval case.in case.ans tmpdir [flags] < case.ans
submission < case.in | outval case.in case.ans tmpdir [flags]

bt validate answer-validates all .ans and also subjects them to the output validator using the third format above. In particular, the (singular) output validator is always run against all .ans files during validation.

mzuenni commented 8 months ago

And in the third case the case case_sensitive and space_change_sensitive flags are still provided? (as far as i can see that is not mentioned in the doc but that was the (undocumented) behavior before)

Side Note: the point Interactive problems starts with jjj

thorehusfeldt commented 8 months ago

Currently we aren’t sending case_sensitive and space_change_sensitive. @RagnarGrootKoerkamp and I did discuss this, and didn’t really discover very strong arguments either way.

Note that if validation: default then it makes no sense anyway (the .ans file is compared against itself).

So the interesting case is when Alice has written a custom output validator. But then the BAPC High Wardens of Problem Quality want to incentivise her to write an explicit answer validator anyway instead of falling back on a tempting (and effort-saving) default.

I have no opinion.

RagnarGrootKoerkamp / BAPCtools

Distinguish output validation from `.ans`-validation #324