Closed thorehusfeldt closed 5 months ago
Haven't thought this through, but what are the drawbacks of simply calling the answer validator with a path to an empty .in file when it's not provided? (This assumes the user only does this when indeed the answer validator does not read stdin.)
Another question: suppose there are 2 answer validators, of which only one reads the .in. (How) can we distinguish these?
There could be two answer validators: one reads the .in input file (maybe it’s the custom output validator), the other doesn’t (it just checks output format, much like what a .ctd-based validator would do). That’s exactly the issue. bt validate needs to understand which validators to call for the pseudotestcase bad_format.ans. In particular, if it calls the custom output validator, that will now crash (because it tries to open a non-existing file). Of course, we now do distinguish exit code 1 from 43, so this would actually work, but I dislike putting semantics on crashing programmes for a normal use-case. So validate.py needs to be able to see from the outside which answer validators it can safely call.
After even more thinking: Maybe @RagnarGrootKoerkamp is (almost) right, and we can just pass a non-existing file as the first argument. (Because we need to distinguish the behaviour of a validator that rejects because the input is empty from a validator that rejects because the answer file is wrong no matter what the input is. Both return 43.)
The semantics is as follows: pseudotestcases like
not_an_int:
  ans: zero
negative:
  ans: "-1"
out_of_bounds:
  ans: "100"
go through answer validation using /path/to/unopenable as the input. In particular, the invocations are
ans_validator /path/to/unopenable < testcase.ans
and (provided the above all passed) even
output_validator /path/to/unopenable testcase.ans < testcase.ans
But typically, an answer validator (such as a lowly .ctd-validator) has already rejected before the output validator got fired up.
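To make the invocation above concrete, here is a minimal sketch of how a runner might call an answer validator with an unopenable input path and classify the result. The function name run_answer_validator and the toy validator are hypothetical and not part of bt validate; the sketch assumes the usual convention that 42 means accept, 43 means reject, and any other exit code (such as 1 from an uncaught exception) is a crash.

```python
import subprocess
import sys

EXIT_AC, EXIT_WA = 42, 43  # assumed accept / reject codes; anything else is a crash

# Hypothetical toy validator: checks stdin fully before touching the input file.
TOY_VALIDATOR = r"""
import re, sys
line = sys.stdin.readline()
if not re.fullmatch(r"\d+\n", line) or not 0 <= int(line) <= 100:
    sys.exit(43)   # rejected from the .ans alone, input never opened
open(sys.argv[1])  # raises if the input path is unopenable (exit code 1)
sys.exit(42)
"""


def run_answer_validator(cmd, ans_path, in_path="/path/to/unopenable"):
    """Run `cmd in_path < ans_path`; return 'accepted', 'rejected' or 'crashed'."""
    with open(ans_path, "rb") as ans:
        proc = subprocess.run(
            [*cmd, str(in_path)],
            stdin=ans,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    if proc.returncode == EXIT_AC:
        return "accepted"
    if proc.returncode == EXIT_WA:
        return "rejected"
    return "crashed"
```

With cmd = [sys.executable, "-c", TOY_VALIDATOR], an .ans file containing 101 is rejected with 43 before the input is ever opened, while an in-range answer makes the validator try to open the missing file and crash. That is exactly the distinction between exit codes 43 and 1 discussed above.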
A problem author who wants to support the above invalid testcases should write an answer validator that checks standard input as much as possible before opening the input file. Like this (for a problem with input N and whose output consists of N many integers):
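import re
import sys

def fail(msg):  # reject with exit code 43
    print(msg, file=sys.stderr)
    sys.exit(43)

line = sys.stdin.readline()  # this is the .ans; keeps the trailing newline
if not re.fullmatch(r"\d+( \d+)*\n", line):  # syntax-check
    fail("integers expected")
if sys.stdin.readline() != "":
    fail("extra output")
for token in line.split():
    if not 0 <= int(token) <= 100:  # range-check each token
        fail("integer out of bounds")
# Only now should we open input
with open(sys.argv[1]) as f:
    N = int(f.readline())
if len(line.split()) != N:
    fail(f"{N} many tokens expected")
sys.exit(42)  # accept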
An author who doesn’t want this constraint imposed on their answer validator wouldn’t be able to invalidate .in-free invalid testcases; basically the situation we are in today. (Except that .ctd does exactly this, but because of a weird reason.)
I think this works.
This seems like a misguided idea to me now. Closing this until I have better things to say.
I want to be able to have .in-less invalid .ans-testcases, like this: (For a problem whose output is in [0..100])
Instead of
The .ctd and .viva Answer Validators already do exactly this, but our definition of AnswerValidator insists on the following invocation:
even if the answer_validator doesn’t even open the testcase.in.
I think .in-free invalid answer invalidation makes sense, is useful, and strictly increases problem quality because it allows me to state that “101 is wrong no matter what the input file is”. (I am not so concerned about saving a line of typing a redundant in: 1 1 in the generator. I’m concerned about the stronger semantics.)
To do this, we must give semantics to “what it means to run an AnswerValidator on a pseudo-testcase without .in”.
Solution 1
Add --input_oblivious to the specification of AnswerValidator. Those that read .ctd and .viva are always input-oblivious anyway; handwritten AnswerValidators can receive this flag (which means they promise not to open(args[1])). When bt validate iterates over its validators, it can look for this flag in the source code, much like --constraints.
Solution 2
Non-backwards-compatibly change the invocation of AnswerValidators to always be
Then the semantics of validating a pseudotestcase is clear: If there is both .in and .ans, both are sent to the validator, else only one is sent.
There are probably other ways of doing this. One of the difficulties is that the Testcase class is very much tied to in_file.
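As an illustration of Solution 1, the following sketch shows how a tool might decide from the outside which answer validators are safe to call on an .in-less pseudotestcase. Everything here is an assumption made for illustration: the helper names is_input_oblivious and validators_for are hypothetical, and the flag is detected by a plain substring search in the validator source, which may differ from how --constraints is actually detected.

```python
from pathlib import PurePath

# Flag name taken from the proposal above; the detection method is an assumption.
FLAG = "--input_oblivious"
# Validator formats described above as always input-oblivious.
ALWAYS_OBLIVIOUS = {".ctd", ".viva"}


def is_input_oblivious(filename, source):
    """True if the validator promises never to open the input file."""
    return PurePath(filename).suffix in ALWAYS_OBLIVIOUS or FLAG in source


def validators_for(validators, has_in_file):
    """Pick validators to run; `validators` maps filename to source text.

    With an .in file every validator runs; without one, only the
    input-oblivious validators are safe to call.
    """
    if has_in_file:
        return sorted(validators)
    return sorted(
        name for name, src in validators.items() if is_input_oblivious(name, src)
    )
```

For a pseudotestcase that only has an .ans file, validators_for(validators, has_in_file=False) would skip a handwritten validator lacking the flag, so no validator is ever asked to open a file that does not exist.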