pc2ccs / pc2v9

Version 9 of the PC^2 Programming Contest Control System
Eclipse Public License 2.0
46 stars 23 forks source link

Provide support for repeated TLE executions #128

Open clevengr opened 3 years ago

clevengr commented 3 years ago

Feature Description: With the advent of increasing use of PC2 in a "virtual contest" environment (that is, where PC2 modules run "in the cloud" instead of on hardware provided "on-site"), there is an increased potential for inaccurate timing during execution of submissions. For example, two consecutive submissions may end up executing on AutoJudges which run on "virtual machines" with unknown different characteristics -- one of which runs faster than the other due to cloud operation overhead, for example.

This issue is not likely to be huge for most submissions -- but the one place it could be significant is with submissions that fail due to "Time Limit Exceeded" (a program might have failed on one VM whereas it would have succeeded on another VM or even on the same VM if run at a different point in time when the cloud resources weren't as loaded). In particular, the thing which "kills" a submission execution when it hits the "problem time limit" is an instance of PC2 class ExecuteTimer; there is potentially a lot of variability both in the JVM overhead for invoking the actionPerformed() method of the ExecuteTimer and for the execution of the code within the actionPerformed() method.

One solution to this problem could be to provide support for multiple executions of submissions which fail due to "TLE". For example, if a submission receives a TLE, it could be executed (say) five more times, and the actual elapsed time for each of the executions could be compared. If the variance between these times is small (say, within 10%?) then it's likely that all the executions received "equally fair" amounts of time, and therefore the submission really did hit a "time limit exceeded". If on the other hand there is wide variance in the actual run times, it's likely that some of the executions suffered more from system overhead than did others -- meaning the submission might have been able to execute within the time limit on a less-loaded machine.

Do you have any specific suggestions for how your feature would be **implemented in PC^2?**

It remains an item for discussion what to do if the variance is NOT within the Tolerance.

clevengr commented 3 years ago

An approach similar to what's described in this issue was apparently tried at this year's "IOI (International Olympiad in Informatics)" contest. Here's a rough description of what they did:

So basically their strategy was "if it got a TLE, but it did stop and did so within 150% of the Time Limit, give it several more opportunities to make it under the time limit". While this doesn't guarantee the elimination of platform variances as a factor, if you run it enough times it's highly likely to do so. (The question of course is, how many times is "enough times"? They felt that 5 times with 150% margin was sufficient...)