SEP-TU-e-2024 / JudgeQueuer

The main system running on the Judge Queuer
MIT License

Fix concurrency issues in the Judge #28

Closed TPGamesNL closed 2 months ago

TPGamesNL commented 3 months ago

There are some concurrency issues in the judge. First, the requirement: all judge requests (for submissions) arrive from different threads. The main concurrency problem is in the JudgeVMSS class, although the other two classes (AzureEvaluator and JudgeVM) also have some issues (all three are in azureevaluator.py). That is because they were all designed without concurrency / multithreading in mind, and they each rely on some shared state.

This state includes judgevmss_dict in AzureEvaluator, judgevm_dict in JudgeVMSS, and tasks, free cpu and free memory in JudgeVM. These values must not be corrupted (i.e. left in an invalid state) by concurrent access.
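As a minimal sketch of keeping one of these dictionaries consistent, the check and the insert can happen under a single short lock. The class bodies below are hypothetical placeholders, not the real code in azureevaluator.py; the names get_vmss, _dict_lock and the "small" machine type are assumptions for illustration. It also assumes that constructing a JudgeVMSS object is cheap and makes no Azure call, since it happens while the lock is held.

```python
import threading


class JudgeVMSS:
    """Hypothetical placeholder for the real JudgeVMSS class."""

    def __init__(self, machine_type):
        self.machine_type = machine_type


class AzureEvaluator:
    """Hypothetical placeholder showing only the dictionary handling."""

    def __init__(self):
        self.judgevmss_dict = {}
        # Assumed name: guards every read and write of judgevmss_dict.
        self._dict_lock = threading.Lock()

    def get_vmss(self, machine_type):
        # Lookup and insert happen under the same lock, so two threads
        # can never both see "missing" and create two entries for one key.
        with self._dict_lock:
            vmss = self.judgevmss_dict.get(machine_type)
            if vmss is None:
                vmss = JudgeVMSS(machine_type)
                self.judgevmss_dict[machine_type] = vmss
            return vmss
```

Every thread that calls get_vmss("small") then receives the same JudgeVMSS object, and the lock is held only for the dictionary access itself.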

Some issues that may arise (this is likely an incomplete list):

Some of these issues can be solved with locks (mutexes), but you should make sure you don't lock too much. Obviously, the lock should not be held while actually submitting to a VM and waiting for results. Likewise, it should not be held while waiting for a new VM to be created, which can take around 5 minutes.
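One possible shape for this, as a sketch only: hold the lock just long enough to reserve or release resources, and do the slow work in between with no lock held. The JudgeVM class here is a hypothetical stand-in with assumed method names (try_reserve, release) and an assumed run_submission helper; the sleep is a placeholder for submitting to the VM and waiting for results.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class JudgeVM:
    """Hypothetical stand-in for the real JudgeVM class."""

    def __init__(self, cpu, memory):
        self.free_cpu = cpu
        self.free_memory = memory
        self.tasks = set()
        # Assumed name: guards free_cpu, free_memory and tasks together.
        self._lock = threading.Lock()

    def try_reserve(self, task_id, cpu, memory):
        # Short critical section: check capacity and claim it atomically.
        with self._lock:
            if cpu > self.free_cpu or memory > self.free_memory:
                return False
            self.free_cpu -= cpu
            self.free_memory -= memory
            self.tasks.add(task_id)
            return True

    def release(self, task_id, cpu, memory):
        # Short critical section: hand the resources back.
        with self._lock:
            self.tasks.discard(task_id)
            self.free_cpu += cpu
            self.free_memory += memory


def run_submission(vm, task_id, cpu, memory):
    """Return True if the task ran, False if the VM had no capacity."""
    if not vm.try_reserve(task_id, cpu, memory):
        return False
    try:
        # Placeholder for submitting and waiting; no lock is held here.
        time.sleep(0.01)
        return True
    finally:
        vm.release(task_id, cpu, memory)


vm = JudgeVM(cpu=4, memory=8)
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(lambda i: run_submission(vm, i, 1, 2), range(8)))
```

Submissions that fit on the VM run in parallel, and the counters return to their initial values once every task has released its reservation. What to do when try_reserve fails (wait, retry, or request a new VM) is left open here.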

Currently, there's a branch called submit-lock in the JudgeQueuer repository. This branch makes a single change, which is to put a lock around the whole submission process of a VMSS. However, this makes the process very inefficient, as it allows only one submission at a time (or at least, one per machine type).

rijkman commented 3 months ago

yappa yappa yappa