dask / distributed

A distributed task scheduler for Dask
https://distributed.dask.org
BSD 3-Clause "New" or "Revised" License
1.57k stars 718 forks source link

Show subclass example in Adaptive docs #4816

Open elaineejiang opened 3 years ago

elaineejiang commented 3 years ago

It'd be helpful to have a skeleton subclass of Adaptive in the documentation that users can copy conveniently. Right now, there are examples for subclassing Cluster but not Adaptive: https://docs.dask.org/en/latest/setup/adaptive.html?highlight=adaptiv#distributed.deploy.Adaptive

Something like:

from distributed import Adaptive

class MyAdaptive(Adaptive):
    async def workers_to_close(self, target: int):
        """
        Get list of workers to close.
        """

    async def recommendations(self, target: int) -> dict:
        """
        Make scale up/down recommendations based on current state and target
        """

    async def target(self):
        """
        Determine target number of workers that should exist.
        """

Happy to create a PR for this.

fjetter commented 3 years ago

PRs for documentation are always welcome!

On top of an actual example, I personally would also be interested to know in which circumstances one might want to subclass Adaptive since this is also not discussed in the docs right now. Feel free to open a PR without that particular information, though.

elaineejiang commented 3 years ago

Sorry for the delayed response! I've found that Adaptive doesn't work as well when the functions I'm submitting are labeled as "unknown tasks". At work, we use a lot of internal functions that have highly variable durations. Instead of estimating task duration, I subclass-ed Adaptive to scale based on the number of unblocked tasks. I recently gave a short talk on this extension at Dask Summit: https://zoom.us/rec/play/3jXhd0X69egba6uXWXsYhrBlNTKef2-J3dTX0Hr0j15NOU-RteQcple[…]g.1623702769220.3b25454921a97baf96ba551741201890&_x_zm_rhtaid=511 (skip to 00:10:42). Let me know what you think!