Kattis / problemtools

Tools to manage problem packages using the Kattis problem package format.
MIT License
105 stars 72 forks source link

Interactive problem: pipes can fill up #113

Open ghamerly opened 5 years ago

ghamerly commented 5 years ago

I'm not sure if this is something that can/should be handled by problemtools, or not, but it may be good if it could be handled there.

An interactive validator can get blocked on a write to standard output if the submission it is evaluating is not consuming that information. The pipe connecting the two programs fills up. When the interactive validator gets blocked, that can cause a judge error, because the validator does not get to exit cleanly.

Here's a scenario where this happened to me. In a loop, the submission makes one query to the validator, and the validator replies with two lines of response. The problem is such that a "reasonable" strategy is for the submission to just submit a bunch of random queries and ignore all responses. The validator gets stuck when trying to reply to a query.

This can be handled from within the validator by using O_NONBLOCK on standard output, and catching any errors that come from trying to write to a blocked file. But as I wrote initially, if there is some way the interactive validator could handle this issue, it would be nice.

simonlindholm commented 5 years ago

I guess the judge error problem appears when the submission exits nicely, but the validator gets walltime limit exceeded due to this behavior? If so, the judge error is probably the result of the pipe closing logic added in #77:

     * We never close the read end of the validator -> user channel -- it only
     * serves to give the grader Judge Error if it doesn't setup up a signal
     * handler for SIGPIPE, and we do want submissions that exit early to be
     * accepted.

looks like we traded one judge error for another...

One solution would be to just give TLE/idleness limit exceeded in that case, just like I'd expect to happen if the submission doesn't exit nicely (e.g. because it spams the validator with queries which deadlock). I agree that it's not optimal though, and having it magically work would be better.

Another solution would be to have an IO proxy process in between validator and submission, but that doubles the time overhead for interactive problems.

Let's try not to require validators to handle signals in weird ways, that's a recipe for making different interactive problems behave differently. For O_NONBLOCK in particular I'm not even sure how I'd expect a validator to handle such a failure.

ghamerly commented 5 years ago

I guess the judge error problem appears when the submission exits nicely, but the validator gets walltime limit exceeded due to this behavior?

That sounds reasonable.

If so, the judge error is probably the result of the pipe closing logic added in #77:

To be clear, I am using the release version of problemtools, i.e. before #77 was merged. So I don't think the JE is caused by the new code.

One solution would be to just give TLE/idleness limit exceeded in that case, just like I'd expect to happen if the submission doesn't exit nicely (e.g. because it spams the validator with queries which deadlock). I agree that it's not optimal though, and having it magically work would be better.

I think what the correct judgement is is debatable.

Another solution would be to have an IO proxy process in between validator and submission, but that doubles the time overhead for interactive problems.

Yes.

Let's try not to require validators to handle signals in weird ways, that's a recipe for making different interactive problems behave differently. For O_NONBLOCK in particular I'm not even sure how I'd expect a validator to handle such a failure.

Right.

austrin commented 5 years ago

To be clear, I am using the release version of problemtools, i.e. before #77 was merged.

The latest release (which is what I assume you mean by "the release version") was made after #77 was merged.

I have also been pondering the idea of having an IO proxy between validator and submission, as it would also allow us to "record" and log the interaction done. I agree that it adds some IO overhead but (a) I/O usually isn't a bottleneck for interactive problems, and (b) I would consider this proxy to be a wrapper around the validator (i.e. the time spent would be taken from the validator's (generous) time budget and not affect the submissions).

Another quick mitigation/bandaid for this issue would be to bump the buffer size of the pipes (I think the default is 64 KiB and bumping it to say 1 MiB is probably pretty risk-free?)

simonlindholm commented 5 years ago

(a) I/O usually isn't a bottleneck for interactive problems

It also increases context switches. It's actually a somewhat real problem -- there are interactive problems like https://boi18-day1.kattis.com/problems/boi18.worm that require hundreds of thousands of queries to become interesting.

(b) I would consider this proxy to be a wrapper around the validator (i.e. the time spent would be taken from the validator's (generous) time budget and not affect the submissions).

For sure. It's just a question of walltime overhead.

Another quick mitigation/bandaid for this issue would be to bump the buffer size of the pipes (I think the default is 64 KiB and bumping it to say 1 MiB is probably pretty risk-free?)

That sounds great.

ghamerly commented 5 years ago

The latest release (which is what I assume you mean by "the release version") was made after #77 was merged.

Ah, sorry, my mistake; I am using a version that does indeed have #77 merged.

eldering commented 5 years ago

I have also been pondering the idea of having an IO proxy between validator and submission, as it would also allow us to "record" and log the interaction done. I agree that it adds some IO overhead but (a) I/O usually isn't a bottleneck for interactive problems, and (b) I would consider this proxy to be a wrapper around the validator (i.e. the time spent would be taken from the validator's (generous) time budget and not affect the submissions).

I'm currently working on implementing this in DOMjudge, because logging the submission output is crucial for debugging and the jury's understanding.

Another quick mitigation/bandaid for this issue would be to bump the buffer size of the pipes (I think the default is 64 KiB and bumping it to say 1 MiB is probably pretty risk-free?)

It's 64KiB in the Linux kernel by default, but can be increased via /proc/sys/fs/pipe-max-size and fcntl(), see https://unix.stackexchange.com/questions/11946/how-big-is-the-pipe-buffer. I think that that's a good backup solution, indeed.

Finally, I would say that a submission with this behavior should not trigger any verdicts except AC/WA depending on whether it gives the right output.

niemela commented 5 years ago

Finally, I would say that a submission with this behavior should not trigger any verdicts except AC/WA depending on whether it gives the right output.

👍

austrin commented 5 years ago

I've updated the interactive runner in problemtools to bump the pipe size to 1 MB. (@ghamerly the test submission in the problem you sent me still fills up the pipe and blocks, but it gets a few test cases further than it used to so... progress...?)