dask / distributed

Workloads where graph topology depends on computed values / dynamic graph topology #3514

Open · JJsrc opened this issue 4 years ago

JJsrc commented 4 years ago

This is a design question, potentially asking the same question as https://github.com/dask/distributed/issues/1663.

For some use cases (finance, in my case), the work of figuring out the task graph's topology can be complex and expensive. Usually this means the final graph topology depends on the values of some node(s) that must be computed first. Today this can be handled either entirely on the client side, or within Dask using the secede/rejoin API (https://docs.dask.org/en/latest/futures.html#submit-tasks-from-tasks).

However, using this API feels quite unnatural, and it requires all "seceded"/paused work to remain in memory (and to occupy a thread). The same problem, and worse for memory usage, arises if everything is done on the client, with the added issue of data moving back and forth between client and cluster.
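
To make the pattern concrete, here is a minimal sketch of the secede/rejoin approach from the linked docs. The helpers `count_legs` and `price_leg` and the toy logic are illustrative placeholders, not a real workload:

```python
from distributed import Client, get_client, secede, rejoin

def count_legs(params):
    # Placeholder for an expensive computation whose result
    # determines the rest of the graph's topology.
    return len(params)

def price_leg(params, i):
    # Placeholder per-leg computation.
    return params[i] * 2

def build_and_run(params):
    # Runs on a worker; it submits further tasks whose number is
    # only known after count_legs has been computed.
    client = get_client()
    n_legs = client.submit(count_legs, params).result()
    futures = [client.submit(price_leg, params, i) for i in range(n_legs)]
    secede()   # release this worker thread while waiting on the subtasks
    results = client.gather(futures)
    rejoin()   # reclaim a thread in the worker's pool before continuing
    return sum(results)

if __name__ == "__main__":
    client = Client()  # local cluster for demonstration
    print(client.submit(build_and_run, [1, 2, 3]).result())  # 12
```

Note that even after secede(), the frame of build_and_run stays alive on the worker until the subtasks finish, which is exactly the memory issue described above.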

Have other people felt the need for a better way to handle dynamic topologies?

mrocklin commented 4 years ago

Is there a particular concern that you have with client-side control, using worker_clients, calling compute within a function, or some of the other mechanisms?

Is there something that you think would work well?

This issue seems pretty open ended right now, so it's not clear to me how to proceed. Perhaps you can give more detail about what you're after here.
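
For reference, the worker_client pattern looks roughly like this; worker_client secedes on entry and rejoins on exit, so the waiting thread is released automatically. The recursive Fibonacci is just the standard toy illustration of a graph that unfolds at run time:

```python
from distributed import Client, worker_client

def fib(n):
    # A task that submits more tasks: the graph topology is
    # determined dynamically as values are computed.
    if n < 2:
        return n
    with worker_client() as client:  # secedes on entry, rejoins on exit
        a = client.submit(fib, n - 1)
        b = client.submit(fib, n - 2)
        return a.result() + b.result()

if __name__ == "__main__":
    client = Client()  # local cluster for demonstration
    print(client.submit(fib, 10).result())  # 55
```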
