dask / dask-jobqueue

Deploy Dask on job schedulers like PBS, SLURM, and SGE
https://jobqueue.dask.org
BSD 3-Clause "New" or "Revised" License

Document how to book multiple cores for a single task #231

Open guillaumeeb opened 5 years ago

guillaumeeb commented 5 years ago

This is to clarify one of the outcomes of #181.

Several people have asked how they can use dask-jobqueue to submit multi-threaded tasks. There are two answers currently:

- book the cores from the job scheduler while keeping each Dask worker to a single task at a time (sketched below), or
- use worker resources when starting workers and submitting tasks.

This needs to be documented.

@lesteve put together some examples: see https://github.com/dask/dask-jobqueue/issues/181#issuecomment-449341161.
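
Those examples are not reproduced here, but a minimal sketch of the first approach could look like the following. It assumes a SLURM cluster; the SLURMCluster parameters, sizes, and the job_cpu value are illustrative assumptions, not the exact examples from the linked comment.

```python
# Hypothetical sketch: book 8 CPUs per job from SLURM, but start the Dask
# worker with a single process and a single thread, so only one task runs
# per job and it can use all 8 cores internally (OpenMP, BLAS, ...).
from dask_jobqueue import SLURMCluster
from dask.distributed import Client

cluster = SLURMCluster(
    cores=1,              # one Dask thread -> one task at a time per worker
    processes=1,          # a single worker process per job
    memory="16GB",        # memory booked per job
    job_cpu=8,            # but ask SLURM for 8 CPUs for that one task
    walltime="01:00:00",
)
cluster.scale(4)          # 4 workers, hence 4 jobs of one single-task worker each
client = Client(cluster)

def heavy_multithreaded_work(x):
    # placeholder for a function that spawns its own threads internally
    return x

futures = client.map(heavy_multithreaded_work, range(16))
```

The point of cores=1 combined with job_cpu=8 is that the batch scheduler reserves the cores while Dask's own concurrency stays at one task per worker.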

djhoese commented 5 years ago

Would this also apply to computing dask arrays (client.compute)? For example, if I have a dask array that is the result of many array operations that by themselves perform best on a threaded scheduler, is there an easy way to say that this array and all of its upstream tasks should be computed on a single worker? It's possible the scheduler is smart enough to keep the data on the same node anyway.

guillaumeeb commented 5 years ago

@djhoese you can target specific workers by name with the workers argument of client.compute (http://distributed.dask.org/en/latest/api.html#distributed.Client.compute), but I'm not sure resources can be used as a kwarg there. Feel free to ask a question on the distributed issue tracker or on Stack Overflow.
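
For illustration, here is a minimal sketch of that workers argument, assuming a running client; the array shape, chunk sizes, and the way one worker is picked are arbitrary choices, not taken from the thread.

```python
# Sketch: pin a whole dask array computation to one worker via the
# ``workers=`` argument of Client.compute.
import dask.array as da
from dask.distributed import Client

client = Client()  # or a client connected to a dask-jobqueue cluster

x = da.random.random((20_000, 20_000), chunks=(2_000, 2_000))
y = (x + x.T).mean(axis=0)

# Pick one worker address and ask the scheduler to keep the graph there.
one_worker = list(client.scheduler_info()["workers"])[0]
future = client.compute(y, workers=[one_worker], allow_other_workers=False)
result = future.result()
```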

guillaumeeb commented 5 years ago

As mentioned in the OP, the resources kwarg solution still seems to have some issues: see #230 or https://github.com/dask/distributed/issues/1851.

In particular, you cannot use adapt() or scale() after tasks have been submitted.
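
For completeness, a hedged sketch of the resources route, keeping the caveat above in mind (scaling happens before any task is submitted). The resource name "slots" is invented for illustration, and worker_extra_args is the spelling used by recent dask-jobqueue releases (older ones exposed it as extra).

```python
# Sketch of the resources-based approach: each worker advertises a single
# "slots" resource, and every submitted task consumes it, so at most one
# such task runs per worker even though the worker has 8 threads/cores.
from dask_jobqueue import SLURMCluster
from dask.distributed import Client

cluster = SLURMCluster(
    cores=8,
    processes=1,
    memory="16GB",
    # "slots" is an arbitrary, user-defined resource name.
    worker_extra_args=["--resources", "slots=1"],
)
cluster.scale(2)          # scale *before* submitting, per the caveat above
client = Client(cluster)

def multithreaded_task(i):
    # placeholder for work that uses several threads internally
    return i

# Each task claims the single "slots" unit, so only one runs per worker
# at a time and can use the 8 booked cores.
futures = client.map(multithreaded_task, range(8), resources={"slots": 1})
```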