Closed: mzuenni closed this 7 months ago
Here is a toy program to test invalidation (and constraints): pickanumber.zip
It has several validators:
pickanumber> tree *_validator*
answer_validators
└── answerformat.py
input_validators
├── invala
│   ├── validate.cpp
│   └── validation.h
└── invalb
    ├── validate.cpp
    └── validation.h
output_validator
└── validate.py
5 directories, 6 files
You can run:
bt generate --no-validators
bt validate --v --invalid
Here is the current output, with parallelisation switched off:
Much of this looks ok. Note, e.g., that testcase neg_b passes the first input validator (invala) only to be rejected by the other, invalb, which is therefore green. Testcases named actually_valid should trigger red lines; they were (deliberately) misplaced into invalid_ in order to check that invalidation works. Every validator is run on every testcase at most once.
But, while we’re here:
I am not sure why invalb is run on too_many_tokens; there is no reason for that (invala just rejected it, as expected).
Also, for invalid_outputs/just_plain_wrong, both input and answer validation are expected to pass, so those lines should be green (not yellow).
When the (singular) output_validator accepts invalid_outputs/actually_valid, it should be red, not yellow. The preceding three lines should be green, not yellow (their behaviour is expected).
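The colouring expected in the points above can be sketched as a small rule. This is only an illustration, not the actual bt implementation: the function name `expected_lines`, the stage labels, and the `accepts` callback are hypothetical, and the choice of yellow for a non-final validator that accepts in the stage where rejection is expected is an assumption (the points above only fix the green and red cases).

```python
from typing import Callable, List, Tuple

# Validators in the order they would be run; names follow the tree listing above.
# The stage labels ("input", "answer", "output") are hypothetical.
VALIDATORS = [
    ("input", "invala"),
    ("input", "invalb"),
    ("answer", "answerformat"),
    ("output", "output_validator"),
]


def expected_lines(
    expected_stage: str,
    accepts: Callable[[str], bool],
    validators: List[Tuple[str, str]] = VALIDATORS,
) -> List[Tuple[str, bool, str]]:
    """Return (validator, accepted, colour) lines for one invalid testcase.

    expected_stage: the stage at which the testcase is supposed to be rejected.
    accepts: hypothetical stand-in for actually running a validator.
    """
    # Index of the last validator belonging to the stage that should reject.
    last = max(
        (i for i, (stage, _) in enumerate(validators) if stage == expected_stage),
        default=-1,
    )
    lines = []
    for i, (stage, name) in enumerate(validators):
        ok = accepts(name)
        if not ok:
            # A rejection ends the run, so later validators are never invoked.
            # Rejection in an earlier stage is treated as unexpected (assumption).
            colour = "green" if stage == expected_stage else "red"
            lines.append((name, ok, colour))
            break
        if stage != expected_stage:
            colour = "green"   # earlier stages are expected to accept
        elif i < last:
            colour = "yellow"  # assumption: a later validator may still reject
        else:
            colour = "red"     # nothing rejected a testcase that should be invalid
        lines.append((name, ok, colour))
        if colour == "red":
            break
    return lines


# too_many_tokens: invala rejects, so invalb is not run at all.
print(expected_lines("input", lambda name: name != "invala"))
# [('invala', False, 'green')]

# invalid_outputs/actually_valid: everything accepts, the last line is red.
print(expected_lines("output", lambda name: True))
```

Under this rule neg_b gives a yellow line for invala followed by a green line for invalb, and invalid_outputs/just_plain_wrong gives four green lines.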
Looks good to me.
correctly verify which of the .in, .ans and .out files inside the invalid_* directories should be rejected