codalab / codabench

Codabench is a flexible, easy-to-use and reproducible benchmarking platform. Check our paper at Patterns Cell Press https://hubs.li/Q01fwRWB0
Apache License 2.0

Execution Time Limit Capped at 600s #1422

Closed: johanneskruse closed 4 months ago

johanneskruse commented 5 months ago

Hi,

The 'Execution Time Limit' is fixed at 600 seconds, and I cannot increase it. My competition has been running for a few weeks with the limit set to more than 600 seconds, but today my submissions started to fail.

These are two valid submissions, but from one hour to the next, they cannot exceed 600 seconds: [Screenshot 2024-04-23 at 10 00 14]

I am still able to decrease the 'Execution Time Limit': [Screenshot 2024-04-23 at 10 50 28]

Is there an explanation or a fix for this?

Link to Ekstra Bladet News Recommendation Competition

Thank you very much!

aporrasc commented 5 months ago

Hello, I am having the same problem. Is there a possible solution, or is it a bug? Thank you very much!

ihsaan-ullah commented 5 months ago

@Didayolo I think this is because of the latest changes

Do you think 600 seconds is not enough? The current setting is MAX_EXECUTION_TIME_LIMIT=600

Didayolo commented 5 months ago

Hi @johanneskruse @aporrasc,

Indeed, we just merged the following change: #1154

It limits the execution time on the default queue (the queue of workers shared by default by all competitions), in order to avoid congestion with long jobs.

The limit is currently set to 600 seconds. Maybe that is a bit too extreme?

For now:

Your feedback on this matter is appreciated in any case. What is the usual duration of the jobs in your competitions?
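
To illustrate the behaviour described above, here is a minimal sketch of how such a cap could work. It is not the actual Codabench code: the function name and the `on_default_queue` flag are hypothetical, and only the value `MAX_EXECUTION_TIME_LIMIT=600` comes from this thread.

```python
# Hypothetical sketch, not the Codabench implementation.
MAX_EXECUTION_TIME_LIMIT = 600  # seconds; value quoted in this thread


def effective_time_limit(requested_seconds: int, on_default_queue: bool) -> int:
    """Return the time limit a job would run with under the assumed capping rule."""
    if on_default_queue:
        # Jobs on the shared default queue are clamped to the global maximum.
        return min(requested_seconds, MAX_EXECUTION_TIME_LIMIT)
    # Jobs on a custom queue keep the limit requested by the organizer.
    return requested_seconds


print(effective_time_limit(10800, on_default_queue=True))   # 600
print(effective_time_limit(300, on_default_queue=True))     # 300
print(effective_time_limit(10800, on_default_queue=False))  # 10800
```

Under this assumed rule, a limit can still be lowered below 600 seconds on the default queue but cannot be raised above it, which matches what was reported at the top of this issue.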

Didayolo commented 5 months ago

Post-it: the "Execution time limit exceeded" message should be more informative when the limitation comes from the use of the default queue.

I added a clarification in the front page message: https://www.codabench.org/

johanneskruse commented 5 months ago

Hi @Didayolo,

Thank you for getting back to me on this matter.

I understand why this is being done; however, it is very unfortunate for our ongoing competition. Our competition demands more in terms of evaluation time (approximately 3 hours), and since it is already ongoing, we cannot optimize or change the scoring program at this point.

We have set up a remote worker (thank you for the easy step-by-step tutorial!), but it is quite slow because each submission is processed sequentially on a single worker. Is there a way to utilize multiple remote workers for the same competition, or to parallelize the jobs on the single remote worker? This would be a great help!
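
As a rough back-of-the-envelope illustration of why a single sequential worker is slow here, the sketch below estimates total wall-clock time when each worker handles one job at a time. The figure of 12 pending submissions is a made-up example, and the function is purely illustrative; only the roughly 3 hour evaluation time comes from this comment.

```python
import math


def total_wall_time_hours(n_submissions: int, hours_per_job: float, n_workers: int) -> float:
    """Rough wall-clock estimate when each worker runs one job at a time."""
    # Jobs are processed in "rounds": each round, every worker takes one job.
    rounds = math.ceil(n_submissions / n_workers)
    return rounds * hours_per_job


# Hypothetical backlog of 12 submissions at about 3 hours each.
print(total_wall_time_hours(12, 3, 1))  # 36
print(total_wall_time_hours(12, 3, 4))  # 9
```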

Again, thank you for the great Codabench platform!

aporrasc commented 5 months ago

Any help on setting up the compute worker?

I am able to run the compute_worker container from the shell using:

docker run -v /codabench:/codabench -v /var/run/docker.sock:/var/run/docker.sock -d --env-file .env --name computeworker --restart unless-stopped --log-opt max-size=100m --log-opt max-file=3 codalab/competitions-v2-compute-worker:latest

A couple of questions arise: 1) Should there be something in a local folder called "codabench"? 2) I submit a test, but it never starts. I am using the same Broker URL. [image]

Any easy-to-follow tutorial?

Thanks in advance!

ihsaan-ullah commented 5 months ago

@aporrasc

  1. Have you created a queue and used its broker URL in the compute worker .env?
  2. Can you check the logs of your compute worker to see whether it is running?
  3. When you submit a submission, do you see any changes in the compute worker logs?

aporrasc commented 5 months ago

  1. Yes.
  2. I checked. The compute worker is ready. When I submit something to test the competition, a problem arises with the SSL certificate: it seems the certificate cannot be verified (a typical problem at companies with strict security).

I should mention that I set "BROKER_USE_SSL=True" in the .env file.

Any advice?

Thanks.

aporrasc commented 5 months ago

Finally, I got it.

But even now, with my own computer as the compute worker, the time limit is set to 600 seconds.

Where can I modify this? The only way I know is through the competition.yaml.

Thanks.

ihsaan-ullah commented 5 months ago

You can change the time limit in your phase settings

[Screenshot 2024-04-26 at 2 31 34 PM]

Didayolo commented 5 months ago

@johanneskruse Sorry for the inconvenience for your competition, and thank you for your understanding.

You absolutely CAN set up several workers in order to parallelize the computation. Simply follow the exact same procedure on several computers (VMs or physical machines). Use the same BROKER_URL, so that the different workers listen to the same custom queue (see schema).
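
The idea of several workers listening to one queue can be illustrated with a small in-process simulation. This is only a stand-in under stated assumptions: real compute workers are separate machines connected through a message broker, whereas this sketch uses Python threads and `queue.Queue`, and the function name `run_workers` is hypothetical.

```python
import queue
import threading


def run_workers(jobs, n_workers):
    """Simulate n_workers consuming one shared queue; return {job: worker_id}."""
    shared = queue.Queue()
    for job in jobs:
        shared.put(job)

    handled = {}
    lock = threading.Lock()

    def worker(worker_id):
        # Each worker keeps pulling jobs until the shared queue is empty.
        while True:
            try:
                job = shared.get_nowait()
            except queue.Empty:
                return
            with lock:
                handled[job] = worker_id

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled


result = run_workers([f"submission-{i}" for i in range(8)], n_workers=3)
print(len(result))  # 8
```

Every job is picked up exactly once, by whichever worker is free first. Which worker handles which job is not deterministic, just as it would not be with several real workers on the same queue.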