DOMjudge / domjudge

DOMjudge programming contest jury system
https://www.domjudge.org
GNU General Public License v2.0

ways to make domjudge faster #232

Closed sameersaini closed 7 years ago

sameersaini commented 8 years ago

I am using DOMjudge to compile and run code. The problem I am facing is that it takes too much time to compile and run code, especially in Java.

The stats I gathered: on a 4-core system, producing a response for 4 correct submissions run at the same time, with 18 test cases each, took 12 seconds, and system utilization was at 100%, and this time keeps increasing with the number of compiles with correct code.

When I analyzed where so much time is being spent, I found that the combined time spent in testcase_run.sh before and after calling runguard is more than the time the runguard process itself takes to run the test case, which seems quite strange to me.
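A measurement like the one described above could be sketched as follows. This is only an illustration of timing the phases around a child process, not DOMjudge's actual testcase_run.sh or runguard logic; `time_phases` and the no-op setup/teardown callables are hypothetical stand-ins.

```python
import subprocess
import sys
import time


def time_phases(setup, command, teardown):
    """Return wall-clock seconds spent in setup, in the child process, and in teardown."""
    t0 = time.perf_counter()
    setup()                                # work done before the guarded run
    t1 = time.perf_counter()
    subprocess.run(command, check=False)   # stand-in for the guarded program run
    t2 = time.perf_counter()
    teardown()                             # work done after the guarded run
    t3 = time.perf_counter()
    return {"setup": t1 - t0, "run": t2 - t1, "teardown": t3 - t2}


if __name__ == "__main__":
    # Hypothetical example: a trivial child process and empty setup/teardown.
    timings = time_phases(lambda: None, [sys.executable, "-c", "pass"], lambda: None)
    for phase, seconds in timings.items():
        print(f"{phase}: {seconds * 1000:.1f} ms")
```

Comparing `setup + teardown` against `run` for each test case shows whether the wrapper or the program itself dominates.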

So I would like to know precisely why you put that code in a .sh file when the same thing could be achieved by putting it in the PHP file.

Looking forward to a prompt reply.

meisterT commented 8 years ago

I don't understand what you mean with: "this time keeps increasing with the number of compiles with correct code."? We only compile the code once per submission/judging.

Can you please show us the judging start/end times and the sum of runtimes for one of these submissions? You should find a line like this on each submission page: Result: correct, Judgehost: judge2-10-0, Judging started: 16:02:04, finished in 00:06 s , max/sum runtime: 0.93/1.76s. Please copy it.
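As a rough illustration of why that line is useful: the difference between the wall-clock judging time and the summed runtime gives an upper bound on the overhead (which also includes compilation). The sketch below assumes that "finished in 00:06 s" means minutes:seconds and that the line always has the layout of the example above; `judging_overhead` is a hypothetical helper, not part of DOMjudge.

```python
import re

# Example line copied from the comment above.
LINE = ("Result: correct, Judgehost: judge2-10-0, Judging started: 16:02:04, "
        "finished in 00:06 s , max/sum runtime: 0.93/1.76s")

# Assumed layout: "finished in MM:SS s , max/sum runtime: MAX/SUMs".
PATTERN = re.compile(
    r"finished in (\d+):(\d+)\s*s\s*,\s*max/sum runtime:\s*([\d.]+)/([\d.]+)s"
)


def judging_overhead(line):
    """Return wall-clock judging seconds minus the summed test case runtime."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("line does not match the expected format")
    wall_seconds = 60 * int(match.group(1)) + int(match.group(2))
    sum_runtime = float(match.group(4))
    return wall_seconds - sum_runtime


if __name__ == "__main__":
    print(f"overhead: {judging_overhead(LINE):.2f} s")  # 6 s wall - 1.76 s runtime
```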

sameersaini commented 8 years ago

I mean that the time keeps increasing with the number of submissions with correct code written by the candidate/team. I have attached a file showing the time analysis I did for C, C++ and Java. Time is in milliseconds.

This shows the time taken in compiling, running test cases, and comparing results for a submission.

This stat was obtained by running 1 correct submission on a 4-core machine.

What I have found is that as the number of correct submissions increases, the overall time (compile + test case runs + compare) also increases. When I ran 4 correct submissions (18 test cases each) in parallel on a 4-core machine, CPU utilization was at 100% and the overall time taken to produce output for each submission was 10 seconds for Java.

This time goes down if I increase the number of cores. So I also ran my analysis on a 32-core machine, and found that when I run 24 correct submissions in parallel on a 32-core machine, the overall time per submission was 16 seconds for Java, with per-core utilization at 100%, which is also large.
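To put the two measurements side by side, here is a small back-of-the-envelope calculation. It assumes the 32-core run also used 18 test cases per submission (the post states this only for the 4-core run) and that the reported times include compilation; `summarize` is a hypothetical helper.

```python
def summarize(submissions, seconds_per_submission, testcases):
    """Derive average seconds per test case and overall submissions per second."""
    return {
        "seconds_per_testcase": seconds_per_submission / testcases,
        "submissions_per_second": submissions / seconds_per_submission,
    }


if __name__ == "__main__":
    runs = {
        "4 submissions on 4 cores": summarize(4, 10.0, 18),
        # Assumption: 18 test cases per submission here as well.
        "24 submissions on 32 cores": summarize(24, 16.0, 18),
    }
    for label, stats in runs.items():
        print(f"{label}: {stats['seconds_per_testcase']:.2f} s per test case, "
              f"{stats['submissions_per_second']:.2f} submissions/s")
```

Under these assumptions the averages are roughly 0.56 s and 0.89 s per test case, compile time included.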

Is it clear now what I want to optimize?

[Attached image "capture": time analysis for C, C++ and Java, in milliseconds]

meisterT commented 8 years ago

Are you saying that a) the overhead increases with the number of correct submissions of the same team (although evaluated before), or b) the overhead increases with the number of active judgedaemons judging on the same machine in parallel?

I cannot believe a), while b) is more probable. Did you disable hyperthreading and turboboost? If you have N cores after disabling hyperthreading, I would run only N-1 judgedaemons.

Do you run cgroups?

Please show us the line that I requested in my previous post for both situations (low and high overhead).

sameersaini commented 8 years ago

Well, currently I am only using the judgehost part of DOMjudge, so I don't think I will be able to provide that line. But the stats I collected were obtained by measuring time directly in the code files, so the timings are certainly correct.

To answer your question: the overhead increases with the number of active judgedaemons judging on the same machine in parallel.

I am using AWS servers, so I am not sure whether hyperthreading and turbo boost are available or can be disabled.

Also, I would like to know the number of parallel judgings you would recommend on an N-core machine at the same time. My analysis shows that performance degrades a lot when I run more than N parallel judgings on an N-core machine.

TPolzer commented 8 years ago

"Also, I would like to know the number of parallel judgings you would recommend on an N-core machine at the same time. My analysis shows that performance degrades a lot when I run more than N parallel judgings on an N-core machine."

That depends on what you want to achieve:

Since judgings also judge timing, you should run at most N-1 judgedaemons, each pinned to one CPU that is not part of the general load balancing (with the Linux isolcpus kernel parameter). If you do this, carefully inspect your CPU: most modern CPUs have turbo/power-saving features that are global per socket (not per core). For AWS, see e.g. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/processor_state_control.html. Even if you do not judge in parallel, you definitely want to disable those features, since they depend on average usage patterns over time.
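A quick way to check a host against this advice might look like the sketch below. It assumes a Linux machine. The sysfs files for turbo and SMT only exist with certain drivers and kernel versions and may be missing on virtualized instances, in which case `read_flag` returns None. The helper names are hypothetical, and the CPU-list parser handles only plain ids and ranges, not the stride syntax the kernel also accepts.

```python
import re
from pathlib import Path


def parse_cpu_list(spec):
    """Expand a CPU list such as '1-3,5' into a sorted list of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        match = re.fullmatch(r"(\d+)(?:-(\d+))?", part.strip())
        if match is None:
            continue  # skip non-numeric tokens such as isolcpus flags
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        cpus.update(range(start, end + 1))
    return sorted(cpus)


def isolated_cpus(cmdline):
    """Return the CPUs named by an isolcpus= argument in a kernel command line."""
    for token in cmdline.split():
        if token.startswith("isolcpus="):
            return parse_cpu_list(token[len("isolcpus="):])
    return []


def read_flag(path):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    print("isolated CPUs:", isolated_cpus(read_flag("/proc/cmdline") or ""))
    # "1" here means turbo is disabled (intel_pstate driver only).
    print("no_turbo:", read_flag("/sys/devices/system/cpu/intel_pstate/no_turbo"))
    # "1" here means SMT/hyperthreading is active.
    print("smt active:", read_flag("/sys/devices/system/cpu/smt/active"))
```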

On the other hand you could argue that one judgedaemon per CPU socket is the possible maximum, since all cores typically have shared last level cache and memory.

If your programs perform a lot of IO you might even consider running at most one judgedaemon per machine.
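The three options above can be summarized in a small helper. This is only a sketch of the reasoning in this comment: the policy names are made up for illustration, and leaving CPU 0 to the operating system is an assumption rather than something DOMjudge prescribes.

```python
def max_judgedaemons(cores, sockets=1, policy="timing"):
    """Upper bound on parallel judgedaemons under the given (hypothetical) policy."""
    if cores < 1 or sockets < 1:
        raise ValueError("cores and sockets must be positive")
    if policy == "timing":   # N-1 daemons, one core left for everything else
        return max(1, cores - 1)
    if policy == "socket":   # cores on a socket share last-level cache and memory
        return sockets
    if policy == "io":       # I/O-heavy programs: one daemon per machine
        return 1
    raise ValueError(f"unknown policy: {policy}")


def cpu_assignment(cores):
    """CPU ids for pinned daemons under the 'timing' policy, leaving CPU 0 free."""
    count = max_judgedaemons(cores, policy="timing")
    if cores == 1:
        return [0]           # nothing to spare on a single-core machine
    return list(range(1, count + 1))


if __name__ == "__main__":
    for policy in ("timing", "socket", "io"):
        print(policy, max_judgedaemons(32, sockets=2, policy=policy))
    print("pin daemons to CPUs:", cpu_assignment(4))
```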

eldering commented 7 years ago

I'm closing this issue, as there is no indication of a bug in DOMjudge and various options to optimize performance have been suggested.