WIPACrepo / pyglidein

Some python scripts to launch HTCondor glideins
MIT License

Add queue handling #23

Closed. samary closed this pull request 8 years ago.

samary commented 8 years ago

- Add by-queue requirement
- Add IIHE local cluster CPU config file

briedel commented 8 years ago

You can configure custom header options yourself; see

https://github.com/dsschult/pyglidein/blob/master/configs/parallel.config#L19

It may be better to have an option max_ram_per_core and run separate instances for each queue to accommodate this.
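A minimal sketch of that idea, assuming one pyglidein instance per queue, each with its own config file. The section layout, the `max_ram_per_core` option, and the `accepts` helper below are hypothetical illustrations, not existing pyglidein options:

```python
# Hypothetical sketch: one instance per queue, each with its own per-core
# RAM cap. Option and section names here are illustrative only.
from configparser import ConfigParser

LONG_QUEUE_CFG = """
[Cluster]
queue = long
max_ram_per_core = 4000
"""

def accepts(cfg, req_memory_mb, req_cpus):
    """Return True if this instance's queue should take the job."""
    limit = cfg.getint('Cluster', 'max_ram_per_core')
    return req_memory_mb / max(req_cpus, 1) <= limit

cfg = ConfigParser()
cfg.read_string(LONG_QUEUE_CFG)
print(accepts(cfg, req_memory_mb=3000, req_cpus=1))   # True: fits under 4000 MB/core
print(accepts(cfg, req_memory_mb=8000, req_cpus=1))   # False: leave it for another queue
```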

samary commented 8 years ago

Maybe I can do that, but I thought other sites could have this use case, and I just wanted to share my proposal to avoid running multiple useless instances. This is working as I want for now (good queue balancing), so if you have another idea for improving my local setup, let me know. You can close this PR if you don't think it's worth it (I'll keep my implementation on my site for easier maintenance).

briedel commented 8 years ago

We may make changes that break your setup. We want to keep things inside submit.py as cluster-agnostic as possible. I am not sure why running three cron jobs is that much more of a hassle than one; we are already splitting the CPU and GPU submission on some clusters into two instances.

samary commented 8 years ago

I run them as daemons (delay option) for now. Maybe going through cron would help reduce the work of tracking down states. OK, let's do that.

samary commented 8 years ago

I ran multiple instances, one for each queue, but the behavior is not the intended one. Thinking about it further, it won't work as expected, since the mem_per_core attribute is never used to skip jobs, only to scale CPUs up. An instance with mem_per_core = 2000 will just multiply the CPU count according to the RAM requested rather than skip the job, while another instance that has enough RAM per core to avoid multiplying CPUs will never see the job (it was already launched by the first instance). So we just have concurrency between instances. I suggest implementing a max_mem_allowed parameter to skip some jobs and let another instance have the chance to run them.
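A rough sketch of the difference, not the actual submit.py code; it only mirrors the behavior described above (mem_per_core scales CPUs, never skips) against the proposed max_mem_allowed check, with hypothetical function names:

```python
# Illustrative only -- not the actual pyglidein code.
import math

def scale_cpus(req_memory_mb, req_cpus, mem_per_core):
    """Current behavior (as described above): grow the CPU count so that
    memory per core stays under mem_per_core."""
    needed = math.ceil(req_memory_mb / mem_per_core)
    return max(req_cpus, needed)

def should_skip(req_memory_mb, max_mem_allowed):
    """Proposed behavior: refuse the job entirely so another instance
    (another queue) can pick it up."""
    return req_memory_mb > max_mem_allowed

# A 6000 MB job on an instance with mem_per_core = 2000 becomes a 3-core
# request instead of being left for a bigger-memory queue:
print(scale_cpus(6000, 1, 2000))   # 3
print(should_skip(6000, 4000))     # True -> leave it for another instance
```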

dsschult commented 8 years ago

max_memory would be good. Similarly, max_* for each resource would be useful: basically, what the maximum slot size is.
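A sketch of what a generic max_* check might look like: compare each requested resource against the configured maximum slot size and skip the job if any of them exceeds it. None of these option names are guaranteed to match the real config schema:

```python
# Hypothetical per-resource maximums ("maximum slot size"); names are
# illustrative, not actual pyglidein options.
MAX_SLOT = {'cpus': 8, 'memory': 16000, 'disk': 100000, 'gpus': 1}

def fits_slot(request, max_slot=MAX_SLOT):
    """Skip (return False) if any requested resource exceeds its maximum."""
    return all(request.get(res, 0) <= limit for res, limit in max_slot.items())

print(fits_slot({'cpus': 4, 'memory': 8000}))    # True
print(fits_slot({'cpus': 4, 'memory': 24000}))   # False -> skip this job
```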

samary commented 8 years ago

So if I set max_memory = 2000 and mem_per_core = 2000, I won't get jobs bigger than 2000 MB and therefore no multicore jobs on this instance? We also noticed that even 5000 MB jobs never actually use that amount of RAM; is the memory requirement a bit overscaled, or did we just miss the big jobs?

dsschult commented 8 years ago

That's the idea.

And yes, iceprod v1 has a problem with requesting too much memory for the average job. It's due to the mean being pulled up by outliers.

samary commented 8 years ago

Another question is about the safety margin: it means we always request more memory than the original job did. As far as I understand, the first requirement (from RPC) is used to scale CPUs according to the configuration, but since a margin is added on top of it, you always end up requesting more than what is configured. Couldn't we move this safety margin to the RPC level? That would be more coherent in the process (request 2000 MB => get 2000 MB, not 2100 MB), IMHO.

dsschult commented 8 years ago

The safety margin is actually there for HTCondor matching considerations. Matching works on powers of 2, so request_memory=2000 will actually only match slots with >= 2048 MB. Thus, we put a safety margin on top of 2000 to make sure the slot has enough memory to match correctly.
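To make the arithmetic concrete, here is a toy version of the padding; the ~5% margin is only an assumption chosen to reproduce the 2000 -> 2100 example mentioned earlier, and the real pyglidein margin may differ:

```python
# Toy numbers only: a ~5% margin is assumed to match the 2000 -> 2100
# example above; this is not the actual pyglidein constant.
SAFETY_MARGIN = 1.05

def padded_request(req_memory_mb):
    """Ask the batch system for a bit more than the job needs so the
    resulting slot still matches after HTCondor rounds the request up."""
    return int(req_memory_mb * SAFETY_MARGIN)

# A job asking for 2000 MB only matches slots with >= 2048 MB (per the
# powers-of-2 matching described above), so the glidein requests 2100 MB.
print(padded_request(2000))   # 2100
```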

samary commented 8 years ago

OK, I understand why you cannot move this requirement higher up. Then I think we should calculate and apply this safety margin earlier, in the job matching (in client.py rather than in submit.py), so that the whole configuration takes the margin into account (I mean, if I put 3000 in the configuration file, it really means 3000 and not more). This would also help with the max_memory option discussed previously.
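A hedged sketch of that proposal, not the actual client.py code: pad the job's request during matching and compare it against the literal configured limit, so submit.py would no longer add anything on top. The margin value and function name are assumptions:

```python
# Hypothetical sketch of applying the margin on the client/matching side.
SAFETY_MARGIN = 1.05   # assumed value, as in the previous example

def matches(req_memory_mb, max_memory_mb):
    """Pad the job's request first, then compare against the literal
    configured max_memory; no extra padding would happen at submit time."""
    return req_memory_mb * SAFETY_MARGIN <= max_memory_mb

print(matches(2000, 3000))   # True: 2100 still fits under the configured 3000
print(matches(2900, 3000))   # False: the padding pushes it over the limit
```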