eriknw / dask-patternsearch

Scalable pattern search optimization with dask
BSD 3-Clause "New" or "Revised" License

Submit more than ncores tasks #2

Closed mrocklin closed 7 years ago

mrocklin commented 7 years ago

It might make sense to submit more tasks than there are workers. This would help to cover up downtime while communicating results. Workers like having a bit of a backlog to stay busy.
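A rough sketch of the backlog idea, using only the standard library's `concurrent.futures` rather than dask, so it shows the pattern and not what dask-patternsearch actually does. The helper name `run_with_backlog` and its `backlog` parameter are made up for illustration. It keeps `n_workers + backlog` tasks submitted at all times, so a worker that finishes always has something queued while results are being collected.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

_DONE = object()  # sentinel marking an exhausted input iterator


def run_with_backlog(func, inputs, n_workers, backlog):
    """Hypothetical helper: keep n_workers + backlog tasks in flight."""
    inputs = iter(inputs)
    limit = n_workers + backlog
    results = []
    pending = set()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Prime the queue with more tasks than there are workers.
        for x in inputs:
            pending.add(pool.submit(func, x))
            if len(pending) >= limit:
                break
        while pending:
            # Block until at least one task finishes.
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                results.append(fut.result())
                # Top the queue back up: one new task per finished task.
                x = next(inputs, _DONE)
                if x is not _DONE:
                    pending.add(pool.submit(func, x))
    return results  # in completion order, not input order
```

With `backlog=0` this degrades to one task per worker, which is the case the issue is about: each worker sits idle between finishing a task and receiving the next one.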

eriknw commented 7 years ago

I totally agree, and this should be discussed in the (non-existent) docs. What do you recommend as the default w.r.t. ncores?

The user can manually specify this with the queue_size= argument, which doesn't need to relate to ncores at all. Behaviorally, one may want to submit more (maybe many more) tasks than there are cores to "slow down" the algorithm and more exhaustively search the region, which may help if there are multiple minima.

mrocklin commented 7 years ago

From an occupancy perspective I'd recommend submitting as many tasks as there are available cores, plus one extra for every worker process.

ncores = client.ncores()  # maps each worker address to its number of cores
total = sum(ncores.values()) + len(ncores)  # all cores, plus one per worker

eriknw commented 7 years ago

What, no PR?! I like that default. Thanks.
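For reference, the default agreed on above could be wrapped up roughly like this. It is only a sketch: `default_queue_size` is a hypothetical name, not something in dask-patternsearch, and it takes a plain mapping of worker address to core count (the shape `client.ncores()` returns) so it can be tried without a cluster. Newer releases of distributed appear to expose the same mapping as `client.nthreads()`.

```python
def default_queue_size(ncores):
    """Hypothetical helper: total cores plus one extra task per worker.

    `ncores` maps worker address -> number of cores, e.g. the result of
    client.ncores() on a dask.distributed Client.
    """
    return sum(ncores.values()) + len(ncores)


# Made-up worker addresses, for illustration only.
example = {"tcp://10.0.0.1:4000": 4, "tcp://10.0.0.2:4000": 8}
print(default_queue_size(example))  # 4 + 8 cores, plus 1 per worker -> 14
```

The result would then be passed as the queue_size= argument mentioned earlier in the thread.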