Closed lidongke closed 5 years ago
pasted from #385
If I have 20 CPUs, 10 envs, and 5 sessions, it will start up 50 processes on 20 CPUs. I don't think that is reasonable, am I right? How can I fully and sensibly use my CPU resources?
Resource allocation with the search module works like so: `ray` detects how many CPUs you have, say 20 in total. This is the compute budget. `search.py` allocates per trial: if a trial has 3 sessions, it assigns 3 CPUs, so it can run 6 trials concurrently to utilize 18 CPUs within its budget.

All of this assumes that environments are light on CPU and RAM (memory). If your environment requires more resources, you can limit the number of trials run in parallel by changing the line with a multiplier to allocate more per trial:
```python
# either scale by your num envs if it dominates the resources
multiplier = ps.get(spec, 'env.0.num_envs') * 0.2
# or simply set a convenient number for your use case
multiplier = 4
num_cpus = min(util.NUM_CPUS, meta_spec['max_session'] * multiplier)
```
Then a trial with 3 sessions will use 12 CPUs, so only 1 trial gets run at a time. This will allocate the budget more reasonably for your use case.
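To make the arithmetic above concrete, here is a small sketch of the budget calculation (my own illustrative helper, not SLM-Lab source; assumes 20 total CPUs and whole-CPU scheduling):

```python
# Sketch of the CPU-budget arithmetic described above (illustrative only).
NUM_CPUS = 20  # total budget detected by ray

def parallel_trials(max_session, multiplier=1):
    # CPUs reserved per trial, capped by the total budget
    cpus_per_trial = min(NUM_CPUS, max_session * multiplier)
    # number of trials that fit concurrently within the budget
    return NUM_CPUS // cpus_per_trial

print(parallel_trials(3))                # default: 3 CPUs/trial -> 6 trials
print(parallel_trials(3, multiplier=4))  # 12 CPUs/trial -> 1 trial
```

This matches the numbers in the thread: with the default multiplier 6 trials run at once on 18 of 20 CPUs, and with `multiplier = 4` only 1 trial runs at a time.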
I advise you to expose `multiplier` in the spec, so that we can more reasonably use envs that are not light.
good suggestion. will do so hopefully sometime this weekend
Hi~ I see in your source code `search.py`: `num_cpus = min(util.NUM_CPUS, meta_spec['max_session'])`. If my spec hyperparameters are `"num_envs": 20`, `"max_trial": 10`, `"max_session": 3`,
then this will run 6 trials at the same time, with 4 trials pending, right? The CPU resources seem OK, but there will be 60 processes; my memory has its limit, so an out-of-memory error and crash could occur.
Are there any measures to prevent this, or should I watch memory usage myself and manually control the number of processes in my spec?
Is it reasonable to consider only "max_session" but not "num_envs"?
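A rough process-count estimate for the spec above shows why memory, not CPU, becomes the bottleneck here. This is my own hedged sketch (it assumes each vector env runs as its own process; the actual process layout depends on the environment):

```python
# Hedged estimate of concurrent processes for the quoted spec
# (assumption: one OS process per env per session).
NUM_CPUS = 20
spec = {'num_envs': 20, 'max_trial': 10, 'max_session': 3}

cpus_per_trial = min(NUM_CPUS, spec['max_session'])      # 3; num_envs is ignored
concurrent_trials = NUM_CPUS // cpus_per_trial           # 6 run, 4 of 10 pend
envs_per_trial = spec['max_session'] * spec['num_envs']  # 60 env processes/trial
total_envs = concurrent_trials * envs_per_trial          # up to 360 overall

print(cpus_per_trial, concurrent_trials, envs_per_trial, total_envs)
```

Under this assumption a single trial already spawns the 60 processes mentioned above, and with 6 trials in flight the total can be several times that, which is why ignoring `num_envs` can exhaust RAM even when the CPU budget looks fine.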
@kengz