Closed thorehusfeldt closed 8 months ago
I really like this idea. It would remove a lot of confusion and clearly indicate that this is something outside the spec.
So we could easily already accept answer_validators
and map that to what is currently internally called output format validators
(but that can then be renamed to just answer validators
:) ).
More tricky is what happens with existing output_validators/
:
log
) to rename this to answer_validators
, and continue to use them as usual.For custom-output-validation problems, the output validator is currently used for two things:
.ans
files, outside the spec. The existing problem with this is that while almost always the .ans
file is simply a valid answer itself, there are very ocasional cases where the .ans
contains some specific metadata used for validation that is not actually a valid answer itself.
So, what we can do is to simply drop this second case and only validate team output.
I think not much is lost with this: Probably in those cases the .ans
isn't even read by the output validator in the first place so it's syntax doesn't matter.
The typical case is something like find the maximum of X. Print this maximum score, followed by a specific solution.
Then the output validator reads the first line of the .ans
to check for the right score and the rest is simply ignored, so the .ans
really doesn't need to be validated fully. If wanted a separate answer validator
could be created to test that the .ans
indeed starts with a score but that's not really so much needed usually I'd say.
One thing I would change maybe is that while currently an output format validator
is called as
output_format_validator /path/to/testcase.in /path/to/testcase.ans < /path/to/testcase.ans
it makes more sense for an answer_validator
to be called as
answer_validator /path/to/testcase.in < /path/to/testcase.ans
deduplicating the .ans
input.
CC @mpsijm @mzuenni
Sounds like a good idea to separate out-of-spec output format validation from the in-spec custom output_validator
:slightly_smiling_face:
Regarding the naming, why not output_format_validators
, if that's already how we call it internally? It's also consistent with what used to be called input_format_validators
in the spec (this is now simply called input_validators
, plus I like the fact that this keeps the directories together in alphabetical order. :innocent:
I agree that having them close in alphabetical order is nice, but really we want to validate the .ans
files, not the format of whatever the teams output. And also input format validators
doesn't actually occur in the spec anymore I think, so it's nice to have symmetry between testcase input
, answer
, output
and {input,answer,output}_validators/
.
All right, makes sense! As long as it's clear that there is a distinction in the code and the documentation between "output" and "answer" (because I expect that new people might think they mean the same), I'm fine with answer_validators
:slightly_smiling_face:
For precision, consider default_answer_validators
or ans_validators
. I don’t like the former (even though default answer is exactly what the validate), because, there is also a default output validator, which is something else. So people who confuse output and answer would be even more confused.
Output and ans are harder to confuse. ans_file_validators
is even more difficult to misunderstand.
I remain weakly in favour of answer_validators
.
(Two days later: clarity should trump aesthetics, so after some soul searching I am now mildly in favour of a directory name called ans_file_validators
and a concept called “.ans
-file validation”. The latter is mildly preferable to “default answer validation” because of the possible confusion with what the “default output validator” does.)
How are answer_validator
invoked now?
@RagnarGrootKoerkamp proposed
answer_validator /path/to/testcase.in < /path/to/testcase.ans
instead of the previous
output_format_validator /path/to/testcase.in /path/to/testcase.ans < /path/to/testcase.ans
However, the second option makes it easier to reuse the output_validator
as answer_validator
(like bapctools did before)?
Believe it or not, the documentation in my PR is quite up to date: https://github.com/RagnarGrootKoerkamp/BAPCtools/blob/b30fbc4150e0cbec9855325d741ed370fe44e455/doc/validation.md
It is as you say. In summary,
inval [flags] < case.in
ansval case.in [flags] < case.ans
outval case.in case.ans tmpdir [flags] < case.ans
submission < case.in | outval case.in case.ans tmpdir [flags]
bt validate
answer-validates all .ans
and also subjects them to the output validator using the third format above. In particular, the (singular) output validator is always run against all .ans
files during validation.
And in the third case the case case_sensitive
and space_change_sensitive
flags are still provided? (as far as i can see that is not mentioned in the doc but that was the (undocumented) behavior before)
Side Note:
the point Interactive problems
starts with jjj
Currently we aren’t sending case_sensitive
and space_change_sensitive
. @RagnarGrootKoerkamp and I did discuss this, and didn’t really discover very strong arguments either way.
Note that if validation: default
then it makes no sense anyway (the .ans
file is compared against itself).
So the interesting case is when Alice has written a custom
output validator. But then the BAPC High Wardens of Problem Quality want to incentivise her to write an explicit answer validator anyway instead of falling back on a tempting (and effort-saving) default.
I have no opinion.
BAPC includes an excellent routine for checking the validity of
.ans
-files with the same case as.in
-files.This does, however, lead to two problems
validation:default
.I suggest to conceptually distinguish between the validation of
data/**/.ans
(which I refer to as default answers, here just answers) and the output of a submission (which I refer to as submission output, here just output). The former is a constraint on the problem designer, the latter is a correctness requirement on the solver’s submission.There are many ways of cutting the cake, but to get the ball rolling, I propose
The semantics are like
input_validators
. (In particular, there can be more than one program in that directory, all of which have to pass on all.ans
files.) A validator in this directory might also have access totestcase.in
andtestcase.ans
.